Low Hardware Overhead Scan Based 3-Weight Weighted 
Random BIST Architectures 

I. DESCRIPTION 

The present Application claims priority from the co-pending U.S. 
Provisional Patent Application Serial No. 60/266,845, the contents of 
which are incorporated herein by reference. 

I.A. Field

This disclosure teaches a new automatic test pattern generator
for generating test patterns for testing circuits. This disclosure further
teaches new serial and parallel scan based 3-weight Weighted Random
Built-in Self-test (BIST) Architectures that have low hardware
overhead.

I.B. Background and Related Art
I.B.1. Introduction

Built-in self-test is a test technique that gives circuits the ability
to test themselves. Test vectors (also called test patterns), each
comprising a set of inputs to the circuit, are applied to the circuit
under test (CUT). The responses of the CUT to the applied test vectors
are compared with expected responses that correspond to a good circuit.
A test pattern generator (TPG) generates test patterns.



BIST can be classified into test-per-clock (parallel) and test-per- 
scan (scan based) according to the way in which the test vectors are 
applied to the CUT. In test-per-clock BIST, the test patterns that are 
output from TPG are directly connected to the inputs of a CUT. A new 
test pattern is applied at every test clock cycle. In contrast, in scan- 
based BIST, the test patterns generated by a TPG are applied to the 
CUT through a scan chain that comprises flip-flops. Therefore, in scan- 
based BIST, a test pattern is applied to a CUT every m+1 cycles, 
where m is the number of flip-flops in the scan chain. 

Random test pattern generators generate test patterns
randomly. Test patterns can also be generated a priori and stored in a
BIST for use during testing. BIST implemented with stored pattern
generators requires high hardware overhead because of the memory
required to store deterministic test patterns. In contrast, BIST
implemented with pseudo-random pattern generators requires very little
hardware overhead. Linear feedback shift registers (LFSRs) and cellular
automata (CA) are two commonly used pseudo-random pattern generators.
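
As a minimal illustration of such a pseudo-random pattern generator, the following Python sketch models a small LFSR. The 4-bit width, the tap positions (the two most significant bits) and the seed are assumptions chosen for the example; none of them is taken from this disclosure.

```python
def lfsr4_states(seed=0b0001):
    """Yield successive states of a hypothetical 4-bit LFSR.

    The feedback bit is the XOR of the two most significant bits; the
    register is shifted left and the feedback bit enters at bit 0.
    """
    state = seed & 0xF
    while True:
        yield state
        feedback = ((state >> 3) ^ (state >> 2)) & 1
        state = ((state << 1) | feedback) & 0xF


def period(seed=0b0001):
    """Count the shifts needed for the register to return to its seed."""
    states = lfsr4_states(seed)
    first = next(states)
    count = 1
    for state in states:
        if state == first:
            return count
        count += 1
```

With these taps the register visits all 15 non-zero states before repeating, which is the longest sequence a 4-bit LFSR can produce.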

Furthermore, it has been observed that random pattern test
sequences generated by LFSRs or CA achieve higher coverage of
unmodeled faults than stored test patterns. Additionally, the number
of stored test patterns is typically very small due to stringent memory
size constraints. However, some circuits require prohibitively long
sequences of random patterns to achieve satisfactory fault coverage.

The random pattern test length required to achieve high fault
coverage is often determined by only a few hard-to-detect faults.
These hard-to-detect faults are also called random pattern resistant
faults because they escape detection by most random patterns. Each
of these random pattern resistant faults has a very low detection
probability, which is defined as the probability that a randomly
generated vector detects the fault. See P. H. Bardell, W. H. McAnney,
and J. Savir, Built-in Test for VLSI: Pseudorandom Techniques, John
Wiley & Sons, 1987. The detection probability of fault f can be defined
as follows:



\[
\text{detection probability of } f = \frac{\text{total number of tests for } f}{2^{m}} \qquad (1)
\]

where m is the number of circuit inputs.

This implies that the hard-to-detect faults have many necessary
input assignments that must be made for their detection. Hence, the
probability that a randomly generated vector satisfies all the necessary
input conditions is low.

Certain practitioners have proposed BIST techniques to improve
detection probabilities of hard-to-detect faults that can be classified as
extreme cases of conventional weighted random pattern testing
(WRPT) BIST. See S. Pateras and J. Rajski, Cube-Contained Random
Patterns and Their Application to the Complete Testing of Synthesized
Multi-level Circuits, in Proceedings IEEE International Test Conference,
pages 473-482, 1991; I. Pomeranz and S. Reddy, 3-Weight Pseudo-Random
Test Generation Based on a Deterministic Test Set for
Combinational and Sequential Circuits, IEEE Trans. on Computer-Aided
Design of Integrated Circuits and Systems, Vol. 12:1050-1058,
July 1993; and M. F. AlShaibi and C. R. Kime, Fixed-Biased
Pseudorandom Built-in Self-Test For Random Pattern Resistant
Circuits, in Proceedings IEEE International Test Conference, pages
929-938, 1994.

For further background information on WRPT BIST, see D.
Neebel and C. R. Kime, Multiple Weighted Cellular Automata, in 
VLSITS, pages 81-86, 1994; H.-J. Wunderlich, Multiple Distributions 
for Biased Random Test Patterns, in Proceedings IEEE International 
Test Conference, pages 236-244, 1988; and R. Kapur, S. Patil, T. J. 
Snethen, and T. W. Williams, Design of an Efficient Weighted Random 
Pattern Generation System, in Proceedings IEEE International Test 
Conference, pages 491-500, 1994. 

In the discussed techniques to improve detection probabilities of
hard-to-detect faults, only three weights, 0, 0.5, and 1, are assigned to
each input, while various weights, e.g. 0, 0.25, 0.5, 0.75, and 1.0, can
be assigned in conventional WRPT. Since only three weights are used,
the circuit to generate weights is simple: weight 1 is obtained by
fixing a signal to a 1. Likewise, weight 0 is obtained by fixing a signal
to a 0. Weight 0.5 is obtained by driving a signal by an output of a
pure random pattern generator, such as an LFSR. Furthermore, since
only three weights are used, the size of memory required to store a
weight set is also smaller than that of the conventional WRPT.
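
The three weights can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the function name `three_weight_pattern` is hypothetical, and Python's `random.Random` stands in for the pure random pattern generator (an LFSR in hardware).

```python
import random


def three_weight_pattern(weights, rng):
    """Return one test pattern for a list of weights drawn from {0, 0.5, 1}."""
    pattern = []
    for weight in weights:
        if weight == 1:
            pattern.append(1)                   # signal fixed to a 1
        elif weight == 0:
            pattern.append(0)                   # signal fixed to a 0
        elif weight == 0.5:
            pattern.append(rng.getrandbits(1))  # driven by the random source
        else:
            raise ValueError("only weights 0, 0.5 and 1 are allowed")
    return pattern


# Example: inputs 1 and 3 are fixed, inputs 2 and 4 stay random.
example = three_weight_pattern([0, 0.5, 1, 0.5], random.Random(1))
```

Because a weight set only needs to distinguish three cases per input, it can be stored far more compactly than a set of arbitrary probabilities.
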

The present disclosure teaches techniques that use an improved
automatic test pattern generator (ATPG) to minimize hardware
overhead and test sequence length in 3-weight WRPT. In addition,
this disclosure also teaches at least three improved BIST architectures
that are implemented using the disclosed techniques for ATPG.

I.B.2. Notations and Definitions

In the present disclosure, the notations used in S. Pateras et al.
are reused with few modifications. For detailed information on these
notations, see S. Pateras and J. Rajski, Cube-Contained Random
Patterns and Their Application to the Complete Testing of Synthesized
Multi-level Circuits, in Proceedings IEEE International Test Conference,
pages 473-482, 1991.
A testcube for a fault is defined as a test that has unspecified
inputs. Let $C = \{c^1, c^2, \ldots, c^n\}$ denote a set of testcubes for
hard-to-detect faults in a circuit under test (CUT), where
$c^j = \{c^j_1, c^j_2, \ldots, c^j_m\}$ is an m-bit testcube and m is the
number of inputs of the CUT. $c^j_k \in \{0, 1, X\}$ for all k, where X is
don't care. Let $C^1, C^2, \ldots, C^d$ be subsets of $C$ and let the
generator $gen(C^i) = \{g^i_1, g^i_2, \ldots, g^i_m\}$ ($i = 1, 2, \ldots, d$)
denote an m-bit tuple where $g^i_k \in \{0, 1, X, U\}$
($k = 1, 2, \ldots, m$) is defined as follows:

\[
g^i_k = \begin{cases}
1 & \text{if } c^j_k = 1 \text{ or } X \;\; \forall c^j \in C^i \text{ and at least one } c^j_k = 1 \\
0 & \text{if } c^j_k = 0 \text{ or } X \;\; \forall c^j \in C^i \text{ and at least one } c^j_k = 0 \\
U & \text{if } \exists\, c^a_k = 1 \text{ and } c^b_k = 0, \text{ where } c^a, c^b \in C^i \\
X & \text{otherwise.}
\end{cases} \qquad (2)
\]

It should be noted that the term generator has the same meaning as
the weight sets used in D. Neebel and C. R. Kime, Multiple Weighted
Cellular Automata, in VLSITS, pages 81-86, 1994; H.-J. Wunderlich,
Multiple Distributions for Biased Random Test Patterns, in Proceedings
IEEE International Test Conference, pages 236-244, 1988; and R.
Kapur, S. Patil, T. J. Snethen, and T. W. Williams, Design of an
Efficient Weighted Random Pattern Generation System, in Proceedings
IEEE International Test Conference, pages 491-500, 1994.

When $g^i_k = U$, there are testcubes in $C^i$ that conflict at input
$p_k$, i.e. $\exists\, a, b$ such that $c^a_k = v$ and $c^b_k = \bar{v}$
($v \in \{1, 0\}$), so that fixing input $p_k$ to a binary value v may make
faults that require input $p_k$ to be assigned $\bar{v}$ undetectable. On
the other hand, when $g^i_k = X$, input $p_k$ is not specified in any
testcube in $C^i$. Hence, in this case, input $p_k$ can be fixed to any
binary value without losing fault coverage. The inputs that are assigned
binary values in testcube $c^j$ but assigned U in the corresponding
position in the generator $gen(C^i)$ are called conflicting bits of
$c^j$. Figure 1 shows example testcubes. In the set of testcubes shown in
Figure 1(a), $c^1$ has 3 conflicting bits since inputs $p_1$, $p_3$, and
$p_6$ are assigned different binary values (0 or 1) in $c^1$ through
$c^4$. Therefore these inputs are assigned U in gen(C). On the other
hand, $c^2$ has only one conflicting bit since only $p_3$, among the
inputs whose corresponding bits are assigned U in gen(C), is assigned a
binary value (0) in $c^2$.
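
The generator definition and the notion of conflicting bits can be sketched in Python as follows. Figure 1 is not reproduced in this text, so the four testcubes below are hypothetical values chosen only to be consistent with the counts and generators quoted in this section; the function names are likewise illustrative.

```python
def gen(cubes):
    """Compute the generator of a testcube set following Equation 2."""
    result = []
    for column in zip(*cubes):
        values = set(column) - {"X"}
        if values == {"1"}:
            result.append("1")
        elif values == {"0"}:
            result.append("0")
        elif values == {"0", "1"}:
            result.append("U")      # testcubes conflict at this input
        else:
            result.append("X")      # input unspecified in every testcube
    return "".join(result)


def conflicting_bits(cube, generator):
    """Return the 1-based input positions that are binary in the cube
    but U in the generator."""
    return [k + 1 for k, (c, g) in enumerate(zip(cube, generator))
            if g == "U" and c in "01"]


# Hypothetical testcubes c1..c4 over inputs p1..p6 (assumed, not from Figure 1).
CUBES = ["0X11X1", "X0010X", "1XXX00", "100X01"]
```

For these assumed cubes, `gen(CUBES)` is `"U0U10U"`, the first cube has conflicting bits at inputs 1, 3 and 6, and the second only at input 3, mirroring the description above.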

a) Detection Probabilities 

Let $F = \{f_1, f_2, \ldots, f_n\}$ be the set of faults and
$C = \{c^1, c^2, \ldots, c^n\}$ be the set of testcubes. Assume that
testcube $c^j$ ($j = 1, 2, \ldots, n$) is the only testcube for the
corresponding fault $f_j$ ($j = 1, 2, \ldots, n$). Under this assumption,
the detection probability of fault $f_j$ is merely $1/2^{m-|X^j|}$,
where m is the number of inputs and $|X^j|$ is the number of inputs in
testcube $c^j$ that are assigned don't care, X.

Let testcubes $c^1$, $c^2$, $c^3$, and $c^4$ shown in Figure 1(a) be the
only testcubes corresponding to faults $f_1$, $f_2$, $f_3$, and $f_4$,
respectively. Testcubes $c^1$, $c^2$, $c^3$, and $c^4$ have 2, 2, 3, and 1
don't care values, respectively. Since each testcube is the only testcube
for the corresponding fault, the detection probabilities of $f_1$, $f_2$,
$f_3$, and $f_4$ can be respectively computed as $1/2^4$, $1/2^4$,
$1/2^3$, and $1/2^5$ (note that the number of inputs, m, is 6 in this
example).
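
The arithmetic above can be checked with a few lines of Python. The testcube strings are hypothetical stand-ins with 2, 2, 3, and 1 don't care values (Figure 1 is not reproduced here), and `detection_probability` is an illustrative name.

```python
from fractions import Fraction


def detection_probability(cube):
    """Probability that a uniformly random vector matches the testcube,
    assuming the cube is the only test for its fault: 1 / 2^(m - |X|)."""
    m = len(cube)
    dont_cares = cube.count("X")
    return Fraction(1, 2 ** (m - dont_cares))


# Hypothetical 6-input testcubes with 2, 2, 3 and 1 don't care values.
cubes = ["0X11X1", "X0010X", "1XXX00", "100X01"]
probabilities = [detection_probability(c) for c in cubes]
```

The resulting values are 1/16, 1/16, 1/8, and 1/32, matching the probabilities stated for the four faults.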

As explained in Equation 2, if input $p_k$ is assigned either 1 or X
in all testcubes in $C^i$ and assigned 1 in at least one testcube, then
it is assigned weight 1, i.e. $g^i_k = 1$. Likewise, if input $p_k$ is
assigned either 0 or X in all testcubes in $C^i$ and assigned 0 in at
least one testcube, then it is assigned weight 0, i.e. $g^i_k = 0$.
Accordingly, if $g^i_k$ is assigned a 1, then input $p_k$ can be fixed to
a 1 to improve, by a factor of 2, the detection probability of all faults
in F that require the application of a 1 at $p_k$ for their detection.
Similarly, if $g^i_k$ is assigned a 0, then input $p_k$ can be fixed to a
0 to improve, by a factor of 2, the detection probability of all faults
in F that require the application of a 0 at $p_k$ for their detection.

When $g^i_k = 1$, none of the faults in F require the assignment of a 0
at $p_k$ for their detection, so fixing $p_k$ to a 1 does not make any
faults untestable. Similarly, when $g^i_k = 0$, none of the faults in F
require the assignment of a 1 at $p_k$ for their detection, so fixing
$p_k$ to a 0 does not make any faults untestable.

In Figure 1(a), if $p_2$ is fixed to a 0, the detection probabilities of
faults $f_2$ and $f_4$, both of which require the assignment of a 0 at
$p_2$ for their detection, increase to $1/2^3$ and $1/2^4$, respectively.
In the same fashion, $p_4$ and $p_5$ can be fixed to a 1 and a 0,
respectively. On the other hand, fixing any of the other inputs $p_1$,
$p_3$, and $p_6$ to a binary value makes some faults untestable since
testcubes for these faults have conflicting values at these inputs. For
example, fixing $p_1$ to a 0 makes faults $f_3$ and $f_4$ untestable
since they require the assignment of a 1 at $p_1$ for their detection.

If the four testcubes $c^1$, $c^2$, $c^3$, and $c^4$ are partitioned into
two groups $C^1 = \{c^1, c^2\}$ and $C^2 = \{c^3, c^4\}$, inputs $p_1$,
$p_2$, $p_4$, $p_5$, and $p_6$ in $C^1$ can respectively be fixed to 0, 0,
1, 0, and 1 and inputs $p_1$, $p_2$, $p_3$, and $p_5$ in $C^2$ to 1, 0, 0,
and 0 without any conflict. Since $p_4$ is assigned X in all testcubes in
$C^2$ ($c^3$ and $c^4$), $g^2_4$ is assigned X in $gen(C^2)$. As a
consequence, $gen(C^1) = \{0, 0, U, 1, 0, 1\}$ and
$gen(C^2) = \{1, 0, 0, X, 0, U\}$. The detection probabilities of $f_1$
and $f_2$ are increased to 1/2 by fixing inputs $p_1$, $p_2$, $p_4$,
$p_5$, and $p_6$ to 0, 0, 1, 0, and 1, and the detection probabilities of
$f_3$ and $f_4$ are also increased to 1/2 by fixing inputs $p_1$, $p_2$,
$p_3$, and $p_5$ to 1, 0, 0, and 0. In order to apply the two generators
$gen(C^1)$ and $gen(C^2)$, inputs $p_1$, $p_2$, $p_4$, $p_5$, and $p_6$
are fixed to 0, 0, 1, 0, and 1 for the first time interval $T_1$, and
then inputs $p_1$, $p_2$, $p_3$, and $p_5$ are fixed to 1, 0, 0, and 0
for the next time interval $T_2$. The procedure to compute these time
intervals is described in Section IV.C.
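
The improvement to 1/2 can be reproduced with a small Python sketch. The two generators are the ones stated above; the testcube strings are hypothetical values consistent with them (Figure 1 is not reproduced here), and `probability_under_generator` is an illustrative name.

```python
from fractions import Fraction


def probability_under_generator(cube, generator):
    """Detection probability of a fault whose only test is `cube` while
    `generator` is applied (inputs with 0/1 fixed, U/X left random)."""
    probability = Fraction(1)
    for c, g in zip(cube, generator):
        if c == "X":
            continue                  # input not needed by this testcube
        if g in "01":
            if g != c:
                return Fraction(0)    # input fixed to the opposite value
        else:
            probability /= 2          # input still random, must match by chance
    return probability


GEN_C1 = "00U101"                      # gen(C^1) from the text
GEN_C2 = "100X0U"                      # gen(C^2) from the text
C1 = ["0X11X1", "X0010X"]              # assumed cubes for f1 and f2
C2 = ["1XXX00", "100X01"]              # assumed cubes for f3 and f4
```

Each cube reaches probability 1/2 under its own generator, while a cube from one group may become undetectable under the other group's generator, which is why the two generators are applied in separate time intervals.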

b) Importance of Generating Appropriate Testcubes

Assume that $f_1$ in Figure 1(b) can also be detected by testcube
$c'^1 = 0111XX$ and that $c'^1$ is used instead of $c^1$ to compose
testcube set $C'^1$. Then, the 1 at $p_2$ of $c'^1$ conflicts with the 0
at $p_2$ of the other testcube $c^2$ in $C'^1$. Hence, only $p_1$, $p_4$,
$p_5$, and $p_6$ can be fixed without conflict in $C'^1$. Since only
$p_1$, $p_4$, $p_5$, and $p_6$ are fixed, the detection probabilities of
$f_1$ and $f_2$, whose testcubes now require opposite values at $p_2$,
become $1/2^2$, a factor of 2 decrease compared with when $c^1$ is used.
If the detection probabilities of all faults in F are to be improved to
1/2 or higher, then $C'^1$ should be partitioned into two smaller groups,
each of which has only one member, $c'^1$ and $c^2$, respectively. Hence,
three generators are required when $c'^1$ is used, instead of the two
generators that are required when $c^1$ is used.

In order to minimize the number of generators required, which 
determines hardware overhead to implement the disclosed BIST, each 
testcube that is added to a testcube set should be selected carefully. 
The newly added testcube needs to be selected such that it has the
minimum number of conflicting inputs with the other testcubes that 
already exist in the testcube set. However, it is known that the 
number of all possible testcubes for hard-to-detect faults may be large 
in large random pattern resistant circuits. Therefore, selecting a best 
testcube, which has the minimum number of conflicting inputs with the 
other testcubes in the testcube set, from a pool of all possible 
testcubes for each hard-to-detect fault will require prohibitively large 
time complexity. Hence, for such circuits, the number of generators 
required may depend significantly on testcubes selected to compose 
the testcube sets. This implies that testcubes for hard-to-detect faults 
should be carefully generated to build testcube sets that need the 
minimum number of generators. 

c) Applying Generators to Circuit Inputs

In this sub-section, the idea of applying generators to the CUT is
explained for the test-per-clock BIST. This information related to the
test-per-clock BIST is provided as a background for a better
understanding of the disclosed techniques related to the test-per-scan
BIST, which is discussed in detail in Section IV.D. Let inputs $p_k$
($k = 1, 2, \ldots, m$) be driven by corresponding random pattern signals
$r_k$ ($k = 1, 2, \ldots, m$) during purely random pattern testing. While
generator $gen(C^i)$ is applied, input $p_k$ with $g^i_k = v$ is fixed to
a binary value v by overriding random pattern signal $r_k$, which is done
by setting the corresponding overriding signal $s_{k/v}$ to a 1. The
overriding signal(s) for input $p_k$ is determined by the values of
$g^i_k$ ($i = 1, 2, \ldots, d$, where d is the number of generators). For
instance, if $g^i_k$ ($i = 1, 2, \ldots, d$) are always assigned X or U in
all d generators, input $p_k$ is not fixed in any generator and hence no
overriding signal is required for $p_k$. If $g^i_k$ are assigned a 1 or X
and assigned a 1 in at least one generator, input $p_k$ should be fixed
to a 1 while a generator $gen(C^i)$ in which $g^i_k = 1$ is applied.
Similarly, if $g^i_k$ are assigned a 0 or X and assigned a 0 in at least
one generator, input $p_k$ should be fixed to a 0 while a generator
$gen(C^i)$ in which $g^i_k = 0$ is applied. If $g_k$ is assigned a 1 in
$gen(C^a)$ (i.e. $g^a_k = 1$) and a 0 in $gen(C^b)$ (i.e. $g^b_k = 0$),
input $p_k$ should be fixed to a 1 while generator $gen(C^a)$ with
$g^a_k = 1$ is applied and to a 0 while generator $gen(C^b)$ with
$g^b_k = 0$ is applied.

To express the number of overriding signals and their overriding
values for each input, $glob\_gen = \{gg_1, gg_2, \ldots, gg_m\}$, where m
is the number of inputs and $gg_k$ ($k = 1, 2, \ldots, m$), is introduced
as follows:

\[
gg_k = \begin{cases}
1 & \text{if } g^i_k = 1, U \text{ or } X \;\; \forall i = 1, 2, \ldots, d \text{ and at least one } g^i_k = 1 \\
0 & \text{if } g^i_k = 0, U \text{ or } X \;\; \forall i = 1, 2, \ldots, d \text{ and at least one } g^i_k = 0 \\
B & \text{if } \exists\, g^a_k = 1 \text{ and } g^b_k = 0 \\
N & \text{otherwise}
\end{cases} \qquad (3)
\]

where d is the number of generators.

When $gg_k = v$ ($v \in \{1, 0\}$), only one overriding signal $s_{k/v}$
is required for input $p_k$, and when $gg_k = B$, two overriding signals
$s_{k/1}$ and $s_{k/0}$ are required. Finally, when $gg_k = N$, all
$g^i_k$ ($i = 1, 2, \ldots, d$) are X's or U's, and input $p_k$ is always
driven by random pattern signal $r_k$ during the entire test application;
hence no overriding signal is required for $p_k$.
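
A short Python sketch can make the global generator and the resulting overriding-signal count concrete. It is an illustration only: the function names are hypothetical, and the two generators used in the example are the ones derived earlier for the partition of the Figure 1(a) testcubes.

```python
def glob_gen(generators):
    """Combine d generators into the global generator of Equation 3."""
    result = []
    for column in zip(*generators):
        fixed = set(column) & {"0", "1"}   # binary values used for this input
        if fixed == {"1"}:
            result.append("1")
        elif fixed == {"0"}:
            result.append("0")
        elif fixed == {"0", "1"}:
            result.append("B")             # fixed to both values over time
        else:
            result.append("N")             # never fixed (only X or U)
    return "".join(result)


def overriding_signal_count(global_generator):
    """One signal for a 0 or 1 entry, two for B, none for N."""
    per_input = {"1": 1, "0": 1, "B": 2, "N": 0}
    return sum(per_input[g] for g in global_generator)


example = glob_gen(["00U101", "100X0U"])   # gen(C^1) and gen(C^2)
```

For these two generators the global generator is `"B00101"`, which calls for seven overriding signals: two for the first input and one for each of the others.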

Figure 2 shows the generators and overriding signals for a
circuit with 5 inputs. Since input $p_1$ is fixed to a 1 in one generator
and to a 0 in others, $gg_1$ is assigned B and two overriding signals
$s_{1/0}$ and $s_{1/1}$ are assigned to input $p_1$. On the other hand,
since input $p_3$ is fixed only to a 1 (in two of the generators), $gg_3$
is assigned a 1 and only one overriding signal $s_{3/1}$ is assigned to
$p_3$. Finally, since $g^i_5$ is assigned no binary value in any of the
generators $gen(C^i)$ ($i = 1, 2, 3, 4$), input $p_5$ is always driven by
random pattern signal $r_5$ without being fixed. Hence, $gg_5$ is
assigned N and no overriding signal is assigned to $p_5$. In conclusion,
since two inputs, $p_1$ and $p_4$, are assigned $gg_k = B$ ($k = 1, 4$)
and inputs $p_2$ and $p_3$ are assigned $gg_2 = 0$ and $gg_3 = 1$,
respectively. The generators shown




typically determines the area required by DECODER. Hence, the 
number of overriding signals required to fix inputs as well as the 
number of generators determines the hardware overhead due to 3- 
weight WRPT. 

The procedure described for the test-per-clock BIST is also
applicable to the test-per-scan BIST (scan-based BIST) with little
modification. Techniques to alter bit sequences that are scanned into
a scan chain to detect hard-to-detect faults are proposed in H.-J.
Wunderlich and G. Kiefer, Bit-Flipping BIST, in Proceedings VLSI
Testing Symposium, pages 337-343, 1996; and N. A. Touba and E. J.
McCluskey, Altering a Pseudo-Random Bit Sequence for Scan-Based
BIST, in Proceedings IEEE International Test Conference, pages
167-175, 1996 (Touba'96). Touba'96 uses a procedure similar to the one
used in Touba'95 to compute the mapping function that converts the
complement of tests that detect any new faults into the tests for
undetected faults. See N. A. Touba and E. J. McCluskey, Altering a
Pseudo-Random Bit Sequence for Scan-Based BIST, in Proceedings
IEEE International Test Conference, pages 167-175, 1996; and N.
Touba and E. McCluskey, Synthesis of Mapping Logic for Generating
Transformed Pseudo-Random Patterns for BIST, in Proceedings IEEE
International Test Conference, pages 674-682, 1995. Since the
procedure is highly dependent on the test sequence that is applied to 
the CUT, sometimes the complement of such tests can be a tautology or,
in the other extreme case, the complement can have a significant
number of specified inputs, leading to high hardware overhead.
Furthermore, the whole BIST circuitry that alters the bit sequence
should be re-designed if the test sequence is changed. Karkala uses
the same techniques as Touba'96 to convert pseudo-random
sequences that do not detect any new faults to deterministic
testcubes. See M. Karkala, N. A. Touba, and H.-J. Wunderlich, Special
ATPG to Correlate Test Patterns for Low-Overhead Mixed-Mode BIST,
in Proceedings 7th Asian Test Symposium, 1998; and N. A. Touba and
E. J. McCluskey, Altering a Pseudo-Random Bit Sequence for Scan-Based
BIST, in Proceedings IEEE International Test Conference, pages
167-175, 1996. However, unlike Touba'95, where deterministic
testcubes are generated by a conventional ATPG procedure, in Karkala,
deterministic testcubes are generated by a special ATPG that considers
correlation among the deterministic BIST. See N. A. Touba and E.
McCluskey, Synthesis of Mapping Logic for Generating Transformed
Pseudo-Random Patterns for BIST, in Proceedings IEEE International
Test Conference, pages 674-682, 1995; and M. Karkala, N. A. Touba,
and H.-J. Wunderlich, Special ATPG to Correlate Test Patterns for
Low-Overhead Mixed-Mode BIST, in Proceedings 7th Asian Test
Symposium, 1998. H.-J. Wunderlich uses a procedure that is similar
to Touba'96 but alters random patterns to deterministic patterns by
flipping some bits of the random patterns. See H.-J. Wunderlich and
G. Kiefer, Bit-Flipping BIST, in Proceedings VLSI Testing Symposium,
pages 337-343, 1996; N. A. Touba and E. J. McCluskey, Altering a
Pseudo-Random Bit Sequence for Scan-Based BIST, in Proceedings
IEEE International Test Conference, pages 167-175, 1996. This
technique is also highly dependent on the random pattern test
sequence applied.

I.B.3. Conventional ATPG for a 3-weight WRPT BIST 
In Pateras, a complete set of testcubes is generated for the two-level
version of synthesized circuits (the simple structure of two-level
circuits makes generation of all tests possible). See S. Pateras and J.
Rajski, Cube-Contained Random Patterns and Their Application to the
Complete Testing of Synthesized Multi-level Circuits, in Proceedings
IEEE International Test Conference, pages 473-482, 1991. These
testcubes are partitioned into several sets such that the number of
conflicting inputs of any testcube in each set is less than or equal to M,
and a generator is computed for each of the sets. Since the number of
generators depends on the testcubes in the testcube sets, as described in
I.B.2(b), testcubes that have many necessary assignments are filtered
out to reduce the number of generators required. This is because, if
testcubes in the set have many necessary assignments, there will be
more conflicting inputs and hence fewer testcubes can be placed in
each testcube set, leading to more generators. Hence testcubes that
have fewer necessary assignments can be included in the testcube
sets to be used to create generators. However, the testcubes for
two-level circuits may contain testcubes that are not necessary for their
multi-level synthesized circuits.

According to the experimental results reported in S. Pateras, 
this filtering procedure requires very high time complexity (the run 
time of this procedure is typically higher than that of synthesis 
procedure). See S. Pateras and J. Rajski, Cube-Contained Random 
Patterns and Their Application to the Complete Testing of Synthesized 
Multi-level Circuits, in Proceedings IEEE International Test Conference, 
pages 473-482, 1991. However, this procedure is believed to be 
necessary in this method to choose good testcubes (testcubes that have
few necessary assignments) from the complete set of testcubes. Since
this method requires a complete set of testcubes, applying this method 
to large circuits, whose two-level versions are not available, may not 
be possible due to prohibitively high time complexity. 

As is clear, the present teaching is aimed at overcoming the 
problems described above in general and providing improved 
techniques. 

II. Summary of the Claimed Invention

In the present disclosure, an improved automatic test pattern
generator (ATPG) that can generate sets of testcubes that are suitable
for 3-weight WRPT is disclosed. In the disclosed techniques,
testcubes in each set share the largest number of common necessary
input assignments, thereby yielding a minimal number of generators.
The disclosed, improved ATPG also considers reducing the overriding
signals required to fix inputs according to generators, as well as the
number of generators. The number of overriding signals is further
minimized by compatibility analysis.

To realize the advantages of the teachings of the present
disclosure, there is provided a method for generating a test set for
hard to detect faults. The method comprises identifying a set of hard
to detect faults and generating the test set for the hard to detect faults
by using an automatic test pattern generator. The automatic test
pattern generator comprises functionality for generators and a global
generator and is adapted to consider hardware overhead and test
sequence lengths. The hardware overheads are incurred when each
new testcube is added to the test set.

Preferably, the test set is generated by a process further
comprising calculating estimated cost using cost functions for each of
the hard to detect faults, selecting a hitherto unmarked target fault
from the set of hard to detect faults that has a minimum cost,
generating a testcube for the selected target fault, and comparing real
cost with estimated cost for the selected hard fault. If the real cost is
greater than the sum of the estimated cost plus a predetermined
error and if there are still unselected faults in the set of hard to detect
faults, the process is repeated with a new unselected fault. However, if
no unselected faults remain in the set of hard to detect faults, a
testcube having a minimum real cost is selected. If the real cost is not
greater than the sum of the estimated cost plus the predetermined
error, the testcube generated is retained. All faults detected by the
selected testcube are marked, the selected testcube is added into a
current test set, and the current generator is updated.

Still preferably, the cost functions comprise controllability
costs, observability costs and test generation costs.

Still preferably, the number of specified inputs for the testcube
is minimized by bit stripping.

Still preferably, the controllability cost for an input is calculated
based on the following formula:

\[
Cv(p_k) = \begin{cases}
0 & \text{if } g^i_k = U \\
0 & \text{if } g^i_k = v \\
w & \text{if } g^i_k = \bar{v} \text{ (where } w \gg 1\text{)} \\
1 & \text{if } g^i_k = X \text{ and } gg_k = B \\
1 & \text{if } g^i_k = X \text{ and } gg_k = v \\
h & \text{if } g^i_k = X \text{ and } gg_k = \bar{v} \text{ (where } h > 1\text{)} \\
h & \text{if } g^i_k = X \text{ and } gg_k = N
\end{cases}
\]

where v is a binary value, 0 or 1, X is a don't care input,

$Cv(p_k)$ represents the cost for an input $p_k$,

$g^i_k$ is an input in the current generator, and

$gg_k$ represents an input in the global generator, and wherein

the controllability cost of each input is used to estimate the
number of input conflicts and overriding signals that would be created
by setting a line to a binary value v.
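
Under one reading of the input cost formula above, the cases can be written as a small Python function. The numeric values of `w` and `h` are assumptions (the text only constrains w to be large and h to exceed 1), and the function name and string encoding of the symbols are hypothetical.

```python
def input_cost(v, g, gg, w=1000, h=2):
    """Controllability cost of setting an input to binary value v.

    v  : "0" or "1", the value the ATPG wants at the input
    g  : entry of the current generator ("0", "1", "X" or "U")
    gg : entry of the global generator ("0", "1", "B" or "N")
    w, h : assumed penalty values with w much larger than h > 1
    """
    v_bar = "1" if v == "0" else "0"
    if g == "U" or g == v:
        return 0          # already conflicting, or already fixed to v
    if g == v_bar:
        return w          # would create a new conflicting input
    if g != "X":
        raise ValueError("unexpected generator entry")
    if gg == "B" or gg == v:
        return 1          # an overriding signal for v already exists
    return h              # gg is v_bar or N: a new overriding signal is needed
```

The ordering 0 < 1 < h < w steers test generation first toward assignments that cost nothing, then toward those that reuse an existing overriding signal, and away from those that add a signal or a conflict.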

Still preferably, the controllability cost for internal lines is
calculated based on the following formula:

the controllability cost for an internal circuit line $l$ in the circuit is

\[
Cv(l) = \begin{cases}
\min_{l_a} \{ Cc(l_a) \} & \text{if } v = c \oplus i \\
\sum_{l_a} C\bar{c}(l_a) & \text{otherwise,}
\end{cases}
\]

where $l_a$ and $l$ are respectively the inputs and the output of a
gate with controlling value c and inversion i.
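
The propagation of costs through a gate can be sketched in Python as below. This follows the usual interpretation of such cost measures (one input at the controlling value suffices, otherwise every input needs the non-controlling value); the function name, the dictionary encoding of per-input costs, and the example numbers are assumptions.

```python
def line_cost(v, controlling, inversion, input_costs):
    """Cost of setting a gate output to v.

    controlling : controlling value c of the gate (0 for AND/NAND, 1 for OR/NOR)
    inversion   : 1 if the gate inverts (NAND/NOR), else 0
    input_costs : one dict per gate input, mapping 0 and 1 to the cost of
                  setting that input to the value
    """
    if v == controlling ^ inversion:
        # A single input at the controlling value decides the output.
        return min(cost[controlling] for cost in input_costs)
    # Otherwise every input must take the non-controlling value.
    return sum(cost[1 - controlling] for cost in input_costs)


# Hypothetical costs for the two inputs of a gate.
inputs = [{0: 1, 1: 2}, {0: 3, 1: 0}]
and_zero = line_cost(0, controlling=0, inversion=0, input_costs=inputs)
and_one = line_cost(1, controlling=0, inversion=0, input_costs=inputs)
```

For an AND gate with these input costs, setting the output to 0 costs 1 (the cheaper of the two inputs) and setting it to 1 costs 2 (both inputs must be 1).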

Still preferably, the test generation cost is a sum of cost to 
activate a specific fault on a line and a cost of propagating the fault 
through the line. 
Another aspect of the claimed invention is a method of
generating test sets for a fault list comprising hard to detect faults, the
method comprising:

initializing $i \leftarrow 0$ and $glob\_gen \leftarrow \{N, N, \ldots, N\}$;

initializing a current testcube set, $C^i \leftarrow \emptyset$;

unmarking all faults in the fault list;

initializing a current generator, $gen(C^i) = \{X, X, \ldots, X\}$ and $j \leftarrow 0$;

if there are no more faults in the fault list, then proceeding to
the WRPT generation step;

generating a testcube $c^j$ using an ATPG;

adding the testcube $c^j$ to the current testcube set, $C^i \leftarrow C^i \cup c^j$;

marking faults detected by the testcube $c^j$;

setting $j \leftarrow j + 1$;

if the number of conflicting inputs of any testcube in $C^i$ is
greater than a positive integer M, then $C^i \leftarrow C^i - c^j$,
$i \leftarrow i + 1$, updating the global generator, and returning to the
step where a new generator is created;

if the number of conflicting inputs of any testcube in $C^i$ is not
greater than M, updating $gen(C^i)$ and returning to the step
where a new fault is considered;

generating 3-weight weighted random pattern testing (WRPT)
patterns by fixing inputs or applying pure random patterns to inputs
according to $gen(C^i)$; and

running fault simulation to drop the faults that are detected by
the generated 3-weight WRPT patterns.
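
The grouping part of this method can be sketched in Python. The sketch is a simplification under stated assumptions: testcubes are supplied as a ready-made list instead of being produced by an ATPG, a rejected testcube simply opens the next set, and fault marking, global generator updates, WRPT pattern generation and fault simulation are left out. All names and the example testcubes are hypothetical.

```python
def gen(cubes):
    """Generator of a testcube set: 0/1 if unanimous, U on conflict, else X."""
    symbols = {frozenset("1"): "1", frozenset("0"): "0", frozenset("01"): "U"}
    return "".join(symbols.get(frozenset(set(col) - {"X"}), "X")
                   for col in zip(*cubes))


def max_conflicts(cubes):
    """Largest number of conflicting inputs over all testcubes in the set."""
    generator = gen(cubes)
    return max(sum(1 for c, g in zip(cube, generator) if g == "U" and c in "01")
               for cube in cubes)


def partition(cubes, M):
    """Greedily group testcubes so no cube has more than M conflicting inputs."""
    sets, current = [], []
    for cube in cubes:
        current.append(cube)
        if max_conflicts(current) > M:
            current.pop()            # C^i <- C^i - c^j
            sets.append(current)     # close the current set, i <- i + 1
            current = [cube]         # simplification: reuse the rejected cube
    if current:
        sets.append(current)
    return [(group, gen(group)) for group in sets]


# Hypothetical testcubes consistent with the generators quoted earlier.
result = partition(["0X11X1", "X0010X", "1XXX00", "100X01"], M=1)
```

With M = 1 the four assumed testcubes split into two sets whose generators are `"00U101"` and `"100X0U"`; a larger M would allow fewer, larger sets at the cost of lower detection probabilities per generator.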

Preferably, the method further comprises merging compatible 
overriding signals (parallel type test-per-scan) or reordering scan 
chain (serial type test-per-scan). 

Another aspect of the claimed invention is a parallel type
test-per-scan built-in self-test circuit comprising a circuit under test
comprising inputs. A set of scan flip-flops is connected to the inputs.
Each of the scan flip-flops has at least a synchronous reset (R) or a
synchronous preset (S) pin. An LFSR is provided for loading random
vectors that provide input to the set of scan flip-flops. A decoder
provides decoder outputs, wherein the decoder outputs control the R
and S pins in the scan flip-flops. The decoder comprises the functionality
of a global generator. A counter provides inputs to the decoder that
determine a state of the decoder outputs. An enable provides inputs
for the decoder. The decoder provides overriding signals to inputs of
the circuit by controlling the input to the R and S pins. The overriding
signals override the random vectors based on inputs in a generator for
test patterns for hard faults, said tests being generated by an
automatic test pattern generator.

Preferably, the enable is provided by an AND gate that
performs an AND operation on an override enable input signal and a last
scan input signal.

Still preferably, random patterns are overridden and the BIST 
enabled by providing a 1 input to the override enable input signal. 

Still preferably, the last scan input signal is set to a 1 only at a 
last cycle of each scan shifting operation. 

Preferably, the counter is adapted to be set to 0 initially and 
maintained at 0 while a specific number (T) of random patterns are 
input, said random patterns being modified based on the generator 
provided by the decoder, said counter being further adapted to be 
incremented and T random patterns applied with a new generator, said 
increment being repeated until all generators have been applied by the 
decoder. 
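
The counter-driven sequencing described above can be modeled behaviorally in Python. This is a rough sketch, not a description of the hardware: `random.Random` stands in for the LFSR, the preset/reset override at the last scan cycle is modeled as overwriting bits after the shift, and the function name and the example generators are assumptions.

```python
import random


def apply_generators(generators, T, rng):
    """Return (counter value, applied pattern) pairs for a whole test session."""
    applied = []
    for counter, generator in enumerate(generators):   # counter starts at 0
        for _ in range(T):                              # T patterns per generator
            scanned = [rng.getrandbits(1) for _ in generator]
            # At the last scan cycle the decoder asserts S or R on the
            # flip-flops whose generator entry is 1 or 0, respectively.
            pattern = [int(g) if g in "01" else r
                       for g, r in zip(generator, scanned)]
            applied.append((counter, pattern))
    return applied


session = apply_generators(["00U101", "100X0U"], T=4, rng=random.Random(0))
```

Each generator is held for T patterns before the counter advances, so a session with d generators applies d times T patterns in total.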

Preferably, compatible overriding signals that can be merged 
are driven by a same output signal of the decoder, thereby reducing a 
number of decoder outputs. 

Preferably, inversely compatible overriding signals are driven by
a decoder output directly and by the same decoder output after passing
through an inverter.

Preferably, if a scan input is assigned a 1 by the global
generator, then the corresponding scan flip-flop has an S pin. 

Preferably, if a scan input is assigned a 0 by the global 
generator, then the corresponding scan flip-flop has an R pin. 

Preferably, if a scan input is assigned both a 0 and 1 by the 
global generator, then the corresponding scan flip-flop has both R and 
S pins. 

Preferably, if a scan flip-flop already has a high active S pin and
its corresponding scan input is assigned a 1 in the global
generator, then a two input OR gate is inserted between the S pin and
a normal preset signal.

Preferably, if a scan flip-flop already has a high active R pin and
its corresponding scan input is assigned a 0 in the global
generator, then a two input OR gate is inserted between the R pin and
a normal reset signal.

Preferably, if a scan flip-flop already has a low active S pin and
its corresponding scan input is assigned a 1 in the global
generator, then a two input AND gate is inserted between the S pin
and a normal preset signal.

Preferably, if a scan flip-flop already has a low active R pin and
its corresponding scan input is assigned a 0 in the global
generator, then a two input AND gate is inserted between the R pin
and a normal reset signal.
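
The pin-selection rules above can be summarized in a short sketch. The
following Python function is illustrative only and is not part of the
disclosed circuit. The name `scan_cell_hookup` and its argument encoding
are assumptions, and the sketch assumes that the R-pin cases gate the R
pin with the normal reset signal in the same way that the S-pin cases
gate the S pin with the normal preset signal.

```python
def scan_cell_hookup(assigned, existing_pins):
    """Decide how the overriding signal reaches each S/R pin of one scan cell.

    assigned:      set of binary values (0 and/or 1) that the global
                   generator assigns to this scan input.
    existing_pins: dict mapping "S" or "R" to "high" or "low" for pins the
                   design already uses in normal operation (hypothetical
                   encoding chosen for this sketch).
    Returns a dict pin -> "direct", "OR", or "AND".
    """
    plan = {}
    for value in sorted(assigned):
        if value not in (0, 1):
            raise ValueError("assigned values must be 0 or 1")
        pin = "S" if value == 1 else "R"   # 1 needs a preset, 0 needs a reset
        polarity = existing_pins.get(pin)
        if polarity is None:
            plan[pin] = "direct"           # pin is free: drive it directly
        elif polarity == "high":
            plan[pin] = "OR"               # merge with the active-high normal signal
        elif polarity == "low":
            plan[pin] = "AND"              # merge with the active-low normal signal
        else:
            raise ValueError("polarity must be 'high' or 'low'")
    return plan


print(scan_cell_hookup({0, 1}, {"S": "high"}))
```

A scan input that the global generator assigns both 0 and 1 therefore
gets entries for both the S and the R pin.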

Another aspect of the claimed invention is a serial type
test-per-scan built-in self-test circuit comprising a circuit under test
comprising inputs. A set of scan flip-flops is connected to the inputs.
An LFSR is provided for loading random vectors that provide input to the
set of scan flip-flops. An AND gate and an OR gate are inserted between
said LFSR and the set of scan flip-flops. A decoder provides two decoder
output signals D0 and D1. The decoder output signals are input to the
AND and OR gates. A generator counter that selects a generator
provides inputs to the decoder. A scan counter provides input to the
decoder. The states of the decoder outputs are together determined by
the generator counter input and the scan counter input. An enable input
is provided for the decoder. The decoder provides overriding signals
that override the random vectors based on tests in a generator for test
patterns for hard faults. The tests are generated by an automatic test
pattern generator.

Preferably, area overhead of the decoder is reduced by inserting
toggle flip-flops between the two outputs of the decoder and inputs to
the AND and OR gates.

Preferably, inputs from the random vector corresponding to 
conflicting inputs in the generator are not overridden.

Preferably, the scan_counter is adapted to increase by 1 at
every positive edge of a scan clock and new values for scan inputs are
scanned in serially until inputs corresponding to all pins have been
scanned in. If a value in a generator is a 1, a 1 is scanned in as a scan
input in a corresponding pin, instead of a value provided by the
random vector. If a value in the generator is a 0, a 0 is scanned in as
a scan input in a corresponding pin, instead of a value provided by the
random vector. If a value in the generator is a don't care or a
conflicting value, then a value in the random vector is scanned in.
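
The serial scan-in rule just described can be illustrated with a small
behavioral sketch. It models only the values that end up in the scan
chain, not the decoder, counters, or gates. The function name `scan_in`
and the string encoding of a generator ('0', '1', 'X' for don't care,
'U' for conflicting) are assumptions made for this example.

```python
def scan_in(generator, random_bits):
    """Return the bits shifted into the scan chain for one test pattern.

    generator:   string with one symbol per scan position, in scan order.
    random_bits: list of 0/1 values supplied by the random source, one per
                 scan position (one per scan clock).
    """
    if len(generator) != len(random_bits):
        raise ValueError("generator and random vector must have equal length")
    shifted = []
    for symbol, random_bit in zip(generator, random_bits):
        if symbol == "1":
            shifted.append(1)            # overridden to 1
        elif symbol == "0":
            shifted.append(0)            # overridden to 0
        elif symbol in ("X", "U"):
            shifted.append(random_bit)   # random value passes through
        else:
            raise ValueError("unknown generator symbol: %r" % symbol)
    return shifted


print(scan_in("1X0U1", [0, 1, 1, 0, 0]))
```

Positions marked 'X' or 'U' keep the random value, so only the fixed
positions of the generator differ from the pure random pattern.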

Still preferably, an order of scanning is determined by using 
genetic algorithms, wherein a permutation of scan elements in the 
scan chain is used as a genetic code, and the genetic algorithm is used 
to determine an order of scan elements that leads to a minimum 
number of minterms. 

Still preferably, compatible overriding signals are merged prior 
to applying genetic algorithms. 

Still preferably, scan inputs in a group of compatible scans are 
rearranged to satisfy routing or load capacity. 

III. Brief Description of the Drawings

The disclosed teachings and techniques are described in more
detail using preferred embodiments thereof with reference to the
attached drawings in which:

Figure 1 shows examples of testcubes.

Figure 2 shows the generators and overriding signals for a
circuit with 5 inputs.

Figure 3 shows an implementation of 3-weight WRPT circuitry
for a test-per-clock BIST corresponding to Figure 2.

Figure 4 shows a flowchart illustrating the disclosed improved
ATPG technique.

Figure 5 shows a test-per-clock BIST implementation with
merged overriding signals.

Figure 6 shows a preferred embodiment of the disclosed
parallel type test-per-scan BIST circuit with 3-weight WRPT.

Figure 7 shows a preferred embodiment of the disclosed serial
type test-per-scan BIST circuit with 3-weight WRPT.

Figure 8 shows minimization of the number of minterms using
toggles.

Figure 9 shows a serial type test-per-scan BIST circuit with
toggle flip-flops.

IV. Detailed Description of the Preferred Embodiments
IV.A. Improved ATPG for 3-weight WRPT BIST

In the disclosed improved ATPG for 3-weight WRPT BIST, first, a
sequence of pure pseudo-random patterns is applied to detect a large
number of random pattern testable faults. It should be noted that
random pattern testable faults refer to faults that are not random
pattern resistant. The sequence of pure random patterns can be
considered as a special generator all of whose elements are assigned
U. In contrast to S. Pateras et al., I. Pomeranz et al., and M. F.
AlShaibi et al., where all testcubes are generated in advance, the
disclosed improved ATPG generates suitable testcubes for
hard-to-detect faults one at a time. See S. Pateras and J. Rajski,
Cube-Contained Random Patterns and Their Application to the Complete
Testing of Synthesized Multi-level Circuits, in Proceedings IEEE
International Test Conference, pages 473-482, 1991; I. Pomeranz and S.
Reddy, 3-Weight Pseudo-Random Test Generation Based on a Deterministic
Test Set for Combinational and Sequential Circuits, IEEE Trans. on
Computer-Aided Design of Integrated Circuit and System, Vol.
12:1050-1058, July 1993; M. F. AlShaibi and C. R. Kime, Fixed-Biased
Pseudorandom Built-in Self-Test For Random Pattern Resistant
Circuits, in Proceedings IEEE International Test Conference, pages
929-938, 1994; and M. F. AlShaibi and C. R. Kime, MFBIST: A BIST
Method for Random Pattern Resistant Circuits, in Proceedings IEEE
International Test Conference, pages 176-185, 1996.

Since only suitable testcubes are generated, a procedure to filter
out testcubes that have many necessary assignments is not required
in the disclosed technique. Generated testcubes are placed into
testcube set $C^i$, called the current testcube set, until placing any
testcube into $C^i$ makes the detection probability of a fault lower than
a predefined threshold $1/2^M$, i.e., the number of conflicting inputs of
the testcube in $C^i$ that has the largest number of conflicting inputs
becomes greater than M. Upon the generation of a testcube set $C^i$, a new
current testcube set $C^{i+1}$ is created into which the testcubes
generated later are placed. Whenever a testcube is placed into $C^i$,
generator gen($C^i$) is updated according to Equation 2. In order to place
as many testcubes into each testcube set as possible (to minimize the
number of generators), each testcube is generated by the disclosed
improved ATPG, taking all testcubes in the current testcube set into
consideration. It should be noted that the disclosed technique can be
considered to be an improvement over PODEM, a conventional ATPG
technique. For more details on PODEM, see P. Goel, An Implicit
Enumeration Algorithm to Generate Tests for Combinational Logic
Circuits, IEEE Trans. on Computers, Vol. C-30(3), March 1981.
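
The bookkeeping for the current testcube set can be sketched as follows.
This is a minimal illustration, not the disclosed implementation. It
assumes that the generator update assigns each element 0 or 1 when all
specified testcube values at that input agree, X when no testcube
specifies the input, and U when both values occur, and it assumes that a
conflicting input of a testcube is an input that the testcube specifies
and whose generator element is U. The function names are hypothetical.

```python
def generator(cubes):
    """Derive a generator string from testcubes over '0', '1', 'X'."""
    gen = []
    for column in zip(*cubes):
        values = set(column) - {"X"}
        if not values:
            gen.append("X")            # no testcube specifies this input
        elif len(values) == 1:
            gen.append(values.pop())   # all specified values agree
        else:
            gen.append("U")            # both 0 and 1 occur: conflict
    return "".join(gen)


def conflicting_inputs(cube, gen):
    """Count inputs that the cube specifies but the generator leaves random."""
    return sum(1 for c, g in zip(cube, gen) if g == "U" and c != "X")


def fits(cubes, candidate, max_conflicts):
    """True if adding candidate keeps every testcube within M conflicts."""
    trial = cubes + [candidate]
    gen = generator(trial)
    return all(conflicting_inputs(c, gen) <= max_conflicts for c in trial)


current = ["1X0X", "10XX"]
print(generator(current))                  # generator before adding a cube
print(generator(current + ["0X01"]))       # generator after adding a cube
print(fits(current, "0X01", max_conflicts=1))
```

With M conflicting inputs left random, a testcube is matched by a random
pattern with probability $1/2^M$, which is why the limit on conflicts
and the detection probability threshold are two views of the same bound.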

As mentioned in I.B.2(c), the hardware overhead for
implementing 3-weight WRPT BIST is determined not only by the
number of generators but also by the number of overriding signals that
are expressed by glob_gen. Hence, in order to minimize hardware
overhead, the disclosed technique attempts to minimize the number of
overriding signals. The global generator glob_gen is updated after
each testcube set, $C^i$, is finalized, i.e., when adding any testcube to
$C^i$ makes the number of conflicting inputs of any testcube greater than
the predefined M. The disclosed improved ATPG also minimizes the
number of overriding signals by taking the current glob_gen into
consideration.

Controllability, observability, and test generation cost are 
defined to guide the disclosed improved ATPG to generate a testcube 
that has the smallest number of conflicting inputs with testcubes in the 
current testcube set and that requires the smallest number of 
overriding signals. 

The controllability cost of each line is used to guide the disclosed
improved ATPG when there is more than one possible backtrace
path for line justification. The controllability cost of input $p_k$,
$C_v(p_k)$, is defined by considering the generator gen($C^i$) of the
current testcube set and the current glob_gen as follows:

$$
C_v(p_k) =
\begin{cases}
0 & \text{if } g_k^i = U \\
0 & \text{if } g_k^i = v \\
w & \text{if } g_k^i = \bar{v} \quad (\text{where } w \gg 1) \\
1 & \text{if } g_k^i = X \text{ and } gg_k = B \\
1 & \text{if } g_k^i = X \text{ and } gg_k = v \\
h & \text{if } g_k^i = X \text{ and } gg_k = \bar{v} \quad (\text{where } h > 1) \\
h & \text{if } g_k^i = X \text{ and } gg_k = N
\end{cases}
\tag{4}
$$

where $v$ is a binary value, 0 or 1, and $\bar{v}$ is its complement.

The purpose of the controllability cost of each input is to
estimate the number of input conflicts and overriding signals that
would be created by setting line $l$ to a binary value $v$. If
$g_k^i = U$, the current testcube set already contains testcubes that
conflict at input $p_k$. Hence, assigning any binary value to $p_k$ does
not cause any more adverse effect, and $C_v(p_k) = 0$. When $g_k^i = 1$
(0), a testcube whose input $p_k$ is assigned a 1 (0) does not cause a
conflict with any testcube in the current testcube set. Hence,
$C_v(p_k) = 0$. If $g_k^i = 1$, all testcubes in the current testcube set
are assigned only 1 or X at input $p_k$. Likewise, if $g_k^i = 0$, all
testcubes in the current testcube set are assigned only 0 or X at input
$p_k$. Hence, a testcube whose input $p_k$ is assigned the opposite value
0 (1) causes a conflict at input $p_k$ with other testcubes in the
current testcube set, and a high cost $C_v(p_k) = w$ is assigned.

A new overriding signal may have to be added to the BIST TPG when
input $p_k$ whose $g_k^i$ is currently X (hence $p_k$ is not specified in
any testcube in the current testcube set) is assigned a binary value.
Assigning input $p_k$, whose $g_k^i$ is currently X, to a binary value
$v$ changes the current value of $g_k^i$ from X to $v$. This can require
an additional overriding signal in the BIST, depending on the current
value of $gg_k$. If $gg_k$ is currently B, two overriding signals,
$s_{k/1}$ and $s_{k/0}$, are already assigned to input $p_k$. Hence,
assigning $p_k$ to any binary value does not require any additional
overriding signal. If $gg_k$ is currently assigned a binary value $v$,
assigning the same value $v$ to $p_k$ requires no additional overriding
signal. Even though these two cases do not require any additional
overriding signal or cause any input conflict, assigning $v$ to $p_k$
may conflict with a testcube generated later that assigns $\bar{v}$ to
$p_k$. Hence, assigning a binary value to $p_k$ is a potential cause of
later conflict and is assigned a small cost, 1. If $gg_k$ is currently N
and $g_k^i = X$, any testcube that requires input $p_k$ to be assigned a
binary value $v$ creates one more overriding signal, $s_{k/v}$, for input
$p_k$. If currently $gg_k = \bar{v}$ and $g_k^i = X$, any testcube that
requires input $p_k$ to be assigned the binary value $v$ changes the
value of $gg_k$ to B, adding one more overriding signal, $s_{k/v}$, for
input $p_k$. For these two cases, the controllability cost of the input
is h, where h is a constant that reflects the cost of adding an
overriding signal.
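
The case analysis for the input controllability cost can be written as a
small lookup. The sketch below is illustrative. It reads the third and
sixth cases of Equation 4 as referring to the complement of v, as the
explanation above implies, and the default values chosen for w and h are
placeholders, since the text only requires w >> 1 and h > 1. The
function name and the single-character encoding are assumptions.

```python
def controllability_cost(g, gg, v, w=1000, h=10):
    """Cost of assigning binary value v to input p_k.

    g:  element of the current generator for p_k: '0', '1', 'X', or 'U'.
    gg: element of the global generator for p_k: '0', '1', 'B', or 'N'.
    v:  '0' or '1'.
    w, h: placeholder constants (assumed values, not from the source).
    """
    if v not in ("0", "1"):
        raise ValueError("v must be '0' or '1'")
    v_bar = "1" if v == "0" else "0"
    if g == "U" or g == v:
        return 0          # already conflicting, or agrees with the current set
    if g == v_bar:
        return w          # creates a new input conflict
    if g != "X":
        raise ValueError("unknown generator symbol: %r" % g)
    if gg == "B" or gg == v:
        return 1          # no new overriding signal, only a possible later conflict
    if gg == v_bar or gg == "N":
        return h          # one more overriding signal is needed
    raise ValueError("unknown global generator symbol: %r" % gg)


print(controllability_cost("X", "N", "1"))
```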

The controllability cost $C_v(l)$ for each internal circuit line $l$ is
computed according to Equation 5, where $l_a$ and $l$ are respectively
the inputs and the output of a gate with controlling value $c$ and
inversion $i$. It should be noted that the above computation of
controllability cost is similar to the testability measures used in
Goldstein et al. See L. H. Goldstein and E. L. Thigpen, SCOAP: Sandia
Controllability/Observability Analysis Program, in Proceedings
IEEE-ACM Design Automation Conference, pages 190-196, 1980.

Note that the controllability cost for circuit line $l$, $C_v(l)$, is
the minimum cost to set line $l$ to $v$. In other words, if $l$ can be
set to $v$ by setting only one of its inputs to binary value $v'$, i.e.,
$v = v' \oplus i$ and $v'$ is the controlling value of the gate that
drives $l$, and input $l_{min}$ of $l$ has the minimum cost function
$C_{v'}(l_{min})$ among the inputs of $l$, then $C_v(l)$ is defined as
$C_{v'}(l_{min})$. However, if there are reconvergent fanouts in the
circuit, then it may not be possible to set $l$ to $v$ by setting
$l_{min}$ to $v'$ due to conflict with other objectives. In that case,
the disclosed improved ATPG selects another input of $l$, which has a
higher cost function than $l_{min}$, to satisfy the objective of setting
line $l$ to $v$. Hence, the actual cost of setting line $l$ to $v$ can be
greater, but cannot be less, than $C_v(l)$.
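
For orientation, one recursion that is consistent with the description
above and with the SCOAP analogy is sketched below. The first case
follows directly from the text (a single input at the controlling value
suffices, so the cheapest such input is taken). The summation in the
second case is an assumption borrowed from SCOAP-style measures and
should not be read as the disclosed Equation 5.

```latex
C_v(l) =
\begin{cases}
\min_{l_a} C_{c}(l_a) & \text{if } v = c \oplus i \\[4pt]
\sum_{l_a} C_{\bar{c}}(l_a) & \text{otherwise (assumed)}
\end{cases}
```

Here the minimum and the sum range over all inputs $l_a$ of the gate that
drives $l$, and the recursion bottoms out at the input costs $C_v(p_k)$.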

During test generation, all gates whose output values are
currently unknown and at least one of whose gate inputs has the fault
effect belong to the D-frontier. For more background information on the
D-frontier, see P. Goel, An Implicit Enumeration Algorithm to Generate
Tests for Combinational Logic Circuits, IEEE Trans. on Computers, Vol.
C-30(3), March 1981. In the disclosed improved ATPG, a gate that is
likely to create a minimum number of conflicts and needs a minimum
amount of additional TPG hardware to propagate the fault effect at its
input is selected from the D-frontier repeatedly, until the fault effect
reaches one or more primary outputs.

The observability cost function described below serves as a
selection criterion to achieve this objective. The observability cost
functions are recursively computed from primary outputs to primary
inputs. The observability cost of line $l$, $O(l)$, is given by
Equation 6, which distinguishes the case in which $l$ is a fanout stem
with branches $l_o$ from the case in which $l$ is a gate input. In the
latter case, $l_o$ is the output of the gate with input $l$ and $l_a$ are
all inputs of $l_o$ other than $l$. The observability cost of line $l$ is
also the minimum cost to propagate a value at line $l$ to one or more
primary outputs. Hence, just as for the controllability cost, if the
circuit has reconvergent fanouts, then the actual cost to propagate a
value at line $l$ to one or more primary outputs can be higher than, but
not less than, the observability cost of line $l$, $O(l)$.
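
A form of the observability recursion that fits this description is
sketched below. Only the fanout-stem condition and the roles of $l_o$
and $l_a$ come from the text. The zero cost at a primary output, the
minimum over branches, and the sum over side inputs are assumptions
modeled on SCOAP-style measures.

```latex
O(l) =
\begin{cases}
0 & \text{if } l \text{ is a primary output (assumed)} \\[4pt]
\min_{l_o} O(l_o) & \text{if } l \text{ is a fanout stem with branches } l_o \\[4pt]
O(l_o) + \sum_{l_a} C_{\bar{c}}(l_a) & \text{otherwise (assumed)}
\end{cases}
```

In the last case the side inputs $l_a$ are set to the non-controlling
value $\bar{c}$ of the gate so that the value on $l$ reaches $l_o$.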

The disclosed improved ATPG will now be described with
reference to an example related to the stuck-at fault model. In order
to generate a testcube to detect a stuck-at-$v$ (s-a-$v$) fault at line
$l$, first the fault is activated by setting line $l$ to $\bar{v}$. The
cost to activate $l$ s-a-$v$ is $C_{\bar{v}}(l)$. Then, the activated
fault effect is propagated to one or more primary outputs. The cost to
propagate the activated fault effect at line $l$ is $O(l)$. Hence, the
test generation cost to generate a testcube for $l$ s-a-$v$ is defined
as the sum of two cost functions:

$$
T_v(l) = C_{\bar{v}}(l) + O(l). \tag{7}
$$

Since the controllability cost and observability cost are defined as the
minimum possible costs, the test generation cost $T_v(l)$, which is
merely the sum of the two cost functions, $C_{\bar{v}}(l)$ and $O(l)$,
is also the minimum cost to generate a testcube to detect a stuck-at-$v$
fault. Hence, the actual cost to generate a testcube to detect a
stuck-at-$v$ fault is always greater than or equal to its test
generation cost function $T_v(l)$.

Since testcubes generated by the disclosed improved ATPG are often
over-specified, a few bits that are assigned binary values by the
disclosed improved ATPG can be replaced by don't cares while
ensuring the detection of the target fault. Testcubes with fewer
specified inputs have fewer conflicting inputs with testcubes already in
the current testcube set so that more testcubes can be placed in a
testcube set. Whenever a testcube is generated, inputs that are
assigned binary values are ordered according to the cost of assigning
each input to its binary value. The value assigned to each of these
inputs is flipped in this order. If the target fault can still be
detected after an input is flipped, the value assigned to that input is
replaced by a don't care. This procedure is called bit stripping.
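
Bit stripping can be sketched as a short loop. The example below is a
minimal illustration under stated assumptions: `detects` stands in for a
fault simulator that accepts partially specified cubes, `cost` stands in
for the per-input assignment cost, and inputs are visited from highest
cost to lowest, which is one possible reading of "ordered according to
the cost". All three names are hypothetical.

```python
def bit_strip(cube, detects, cost):
    """Replace unnecessary binary assignments by don't cares.

    cube:    dict mapping input name -> 0 or 1 (unspecified inputs absent).
    detects: callable taking such a dict and returning True if the target
             fault is still detected (stand-in for fault simulation).
    cost:    callable (input name, value) -> number used for ordering.
    """
    stripped = dict(cube)
    order = sorted(stripped, key=lambda name: cost(name, stripped[name]),
                   reverse=True)          # assumed: most expensive first
    for name in order:
        flipped = dict(stripped)
        flipped[name] = 1 - stripped[name]
        if detects(flipped):
            del stripped[name]            # value is irrelevant: make it X
    return stripped


# Toy fault that is detected whenever a = 1 and b = 0, whatever c is.
toy_detects = lambda c: c.get("a") == 1 and c.get("b") == 0
print(bit_strip({"a": 1, "b": 0, "c": 1}, toy_detects, lambda n, v: 0))
```

In the toy case only input c is stripped, because flipping a or b loses
the detection.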

If a circuit has any reconvergent fanout, an input assignment
required to satisfy some objectives may conflict with that required to
satisfy other objectives, causing the disclosed improved ATPG to select
an objective or backtrace path with high cost. Hence, in circuits with
reconvergent fanouts, the actual cost of the testcube generated for a
fault may be much higher than the cost of the fault given by the
estimated test generation cost function shown in Equation 7.

To prevent adding such testcubes to the current testcube set, if
the actual cost of a generated testcube is higher by a certain number
(say, 100) than the estimated test generation cost of the fault, the
generated testcube is discarded. Test generation is then carried out
for alternative target faults until a testcube is found for a fault whose
actual cost is close to the estimated test generation cost of the fault
or all faults in the fault list are tried. However, even in the worst
case where all faults in the fault list need to be tried, generating
testcubes is required only for a few faults. The estimated test
generation cost of a fault is always an optimistic approximation of the
actual cost for the fault in the sense that the actual cost of any
testcube for the fault cannot be less than the estimated test generation
cost. Hence, if the estimated test generation cost of a fault is greater
than the actual cost of the testcube that has the minimum actual cost
among the testcubes that have been generated to be added to the current
set but discarded due to high actual cost, then the actual cost of any
testcube for the fault cannot be less than the current minimum actual
cost and hence generating a testcube for the fault is not required. If
testcubes for all faults in the fault list have very high actual cost,
then the testcube that has the minimum actual cost is chosen to be a new
member of the current testcube set.

A flowchart showing a preferred embodiment of the disclosed
improved ATPG is shown in Figure 4.

In 4.20, costs for all the hard-to-detect faults are computed. In
4.30, a hitherto unselected target fault that has the minimum cost is
selected from the set of hard-to-detect faults. A testcube is generated
in 4.40 for the selected target fault. In 4.50, bit stripping is
performed. In 4.60, the real cost is compared with the estimated cost.
If the real cost is greater than the estimated cost plus a predetermined
error in 4.60, the process returns to 4.30 if there are still unselected
faults. A testcube that has a minimum real cost is chosen in 4.80 if no
unselected faults remain. If the real cost is not greater than the
estimated cost plus a predetermined error in 4.60, the testcube
generated in 4.40 is retained as the selected testcube. Finally, all
faults detected by the selected testcube are marked in 4.90.
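
The selection loop of the flowchart, together with the pruning argument
based on the estimated cost being a lower bound, can be sketched as
follows. This is an illustrative outline, not the disclosed
implementation: `estimate`, `generate`, and `actual_cost` are
hypothetical callables standing in for the cost computation (4.20), the
testcube generation with bit stripping (4.40 and 4.50), and the real
cost evaluation. The default margin of 100 follows the "say, 100"
example in the text.

```python
def select_testcube(faults, estimate, generate, actual_cost, margin=100):
    """Pick the next testcube for the current testcube set.

    Returns (fault, cube). Faults are tried in order of increasing
    estimated cost. A cube is accepted when its actual cost is within
    `margin` of the estimate; otherwise the cheapest discarded cube is
    returned once no promising fault remains.
    """
    best = None                                   # (cost, fault, cube) among discarded cubes
    for fault in sorted(faults, key=estimate):    # 4.30: minimum estimated cost first
        if best is not None and estimate(fault) > best[0]:
            break                                 # lower bound already exceeds best discarded cube
        cube = generate(fault)                    # 4.40 and 4.50
        cost = actual_cost(cube)
        if cost <= estimate(fault) + margin:      # 4.60: real cost close to estimate
            return fault, cube
        if best is None or cost < best[0]:
            best = (cost, fault, cube)
    if best is None:
        raise ValueError("fault list is empty")
    return best[1], best[2]                       # 4.80: minimum real cost


est = {"f1": 5, "f2": 20, "f3": 400}
real = {"f1_cube": 300, "f2_cube": 250, "f3_cube": 410}
print(select_testcube(est, est.get, lambda f: f + "_cube", real.get))
```

In the toy data no cube is within the margin, the third fault is never
expanded because its estimate already exceeds the best discarded cost,
and the cheapest discarded cube is returned.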

Like all other ATPG algorithms, the conventional PODEM also has
non-polynomial worst case time complexity and hence may fail to
generate a test for some faults or identify their redundancy due to
backtracks that exceed a predefined backtrack limit. These faults are
called aborted faults. Experiments were conducted to check whether
backtracks increase due to the new cost functions used by the
disclosed improved ATPG. For comparison, a PODEM that used the
cost function defined in L. H. Goldstein et al. to reduce backtracks was
implemented to generate testcubes for the same faults. See L. H.
Goldstein and E. L. Thigpen, SCOAP: Sandia Controllability/Observability
Analysis Program, in Proceedings IEEE-ACM Design Automation
Conference, pages 190-196, 1980. When the same backtrack limit,
500, was used, the numbers of aborted faults in this case were close
to those aborted by the disclosed improved ATPG for all circuits. This
implies that the large number of backtracks in some circuits results
from the limitation of the PODEM algorithm, on which the disclosed ATPG
is an improvement, rather than from the new cost functions.

If test generation for a fault is aborted, a further improvement
comprises invoking a SAT-based ATPG, which is implemented based
on TEGUS, to find necessary input assignments for the aborted fault.
For more background information on TEGUS, see P. Stephan, R. K.
Brayton, and A. L. Sangiovanni-Vincentelli, Combinational Test
Generation Using Satisfiability, IEEE Trans. on Computer-Aided Design
of Integrated Circuit and System, Vol. 15(9), Sep. 1996.

The necessary input assignments found by the SAT-based ATPG
are further processed to find more necessary input assignments by
conflict analysis. During conflict analysis, each input $p_k$ that is not
assigned a necessary value to detect the fault by the SAT-based ATPG
is set to a 0 first and the 0 at input $p_k$ is propagated into internal
circuit lines. If a fault effect ($D$ or $\bar{D}$) propagates to any of
the primary outputs after the propagation process, the cube that is
composed of the current values assigned at primary inputs, which are
assigned by the SAT-based ATPG and conflict analysis, is a testcube for
the fault. Otherwise, if the assignment of a 0 at input $p_k$: (1) blocks
all x-paths in the circuit, (2) removes all gates from the D-frontier, or
(3) assigns a binary value v to the fault site that is the same as the
fault value, s-a-v, then input $p_k$ must be assigned a 1 to detect the
fault, i.e., the 1 is the necessary value at input $p_k$ to detect the
fault. For more background information, see M. Abramovici, M. A. Breuer,
and A. D. Friedman, Digital Systems Testing and Testable Design,
Computer Science Press, New York, N.Y., 1990.

Conflict analysis is repeated for the assignment of a 1 at input
$p_k$. If the assignment of both 0 and 1 at the input causes at least one
of the three conflict conditions described above, the fault is redundant.
Since necessary values for more inputs can be identified after the
necessary input value for an input is assigned to the input, conflict
analysis is repeated until no further necessary input assignment is
identified. The computed necessary input assignments for the aborted
fault are passed to the disclosed improved ATPG, to finally generate a
testcube for the fault.
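
The fixed-point character of conflict analysis can be illustrated with a
short sketch. It abstracts the circuit away: the hypothetical callable
`conflicts` stands in for value propagation followed by a check of the
three conflict conditions, and the early exit that applies when a fault
effect reaches a primary output is omitted for brevity. Names and data
encoding are assumptions.

```python
def necessary_assignments(inputs, assigned, conflicts):
    """Extend known necessary input values until nothing new is found.

    inputs:    list of primary input names.
    assigned:  dict of necessary values already found (e.g. by a SAT step).
    conflicts: callable taking a partial assignment dict and returning
               True if it makes detection of the fault impossible.
    Returns (assignment, redundant), where redundant is True when some
    input conflicts for both 0 and 1.
    """
    known = dict(assigned)
    changed = True
    while changed:                       # repeat until no new necessary value
        changed = False
        for name in inputs:
            if name in known:
                continue
            bad0 = conflicts({**known, name: 0})
            bad1 = conflicts({**known, name: 1})
            if bad0 and bad1:
                return known, True       # neither value works: redundant fault
            if bad0 or bad1:
                known[name] = 1 if bad0 else 0
                changed = True
    return known, False


# Toy condition: detection needs a = 1 and at least one of b, c equal to 1.
def toy_conflicts(asg):
    return asg.get("a") == 0 or (asg.get("b") == 0 and asg.get("c") == 0)

print(necessary_assignments(["a", "b", "c"], {"c": 0}, toy_conflicts))
```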

IV.B. Merging Overriding Signals 

After testcubes for all hard-to-detect faults are generated,
overriding signals are built by expanding the final glob_gen as shown
in I.B.2(c). If $gg_k$ is assigned a 1, an OR gate is assigned to input
$p_k$. Input $p_k$ is fixed to a 1 by setting an input of the OR gate,
$s_{k/1}$, to a 1 while generator gen($C^i$) with $g_k^i = 1$ is applied,
as shown in Figure 3. Similarly, if $gg_k$ is assigned a 0, an AND gate
is assigned to input $p_k$. Input $p_k$ is fixed to a 0 by setting an
input of the AND gate, $s_{k/0}$, to a 1 while generator gen($C^i$) with
$g_k^i = 0$ is applied. If $g_k^i$ is assigned U in generator gen($C^i$),
there are input conflicts among the testcubes in $C^i$. Hence, input
$p_k$ cannot be fixed and all overriding signals assigned to input $p_k$
should be set to a 0 while gen($C^i$) is applied. If $gg_k$ is assigned
B, an OR gate and an AND gate are assigned to input $p_k$. Input $p_k$ is
driven by the OR gate that is driven by a corresponding overriding
signal $s_{k/1}$ and the AND gate that is driven by the other overriding
signal $s_{k/0}$ (see overriding signals $s_{1/0}$ and $s_{1/1}$ in
Figure 3). When input $p_k$ needs to be fixed to a 1 (while gen($C^a$)
with $g_k^a = 1$ is applied), the overriding signal $s_{k/1}$ is set to a
1 and the value of $s_{k/0}$ is don't care at this time. In contrast,
when $p_k$ is required to be fixed to a 0 (while gen($C^b$) with
$g_k^b = 0$ is applied), $s_{k/1}$ must be set to a 0 so as not to
override the value at the output of the AND gate, 0. If $g_k^i$ is
assigned X in generator gen($C^i$), the state of the corresponding
overriding signal $s_{k/v}$ and/or $s_{k/\bar{v}}$ is don't care X while
gen($C^i$) is applied. However, if $g_k^i$ is assigned U in generator
gen($C^i$), input $p_k$ must be assigned both 1 and 0. Hence, the
corresponding overriding signals $s_{k/v}$ and/or $s_{k/\bar{v}}$ should
be assigned 0's so as not to override random pattern signal $r_k$.

Any two overriding signals $s_{a/y}$ and $s_{b/z}$ ($a \neq b$ and
$y, z \in \{0, 1, X\}$) are not compatible if $s_{a/y} = \bar{v}$ and
$s_{b/z} = v$ ($v \in \{0, 1\}$) in any generator. Otherwise, $s_{a/y}$
and $s_{b/z}$ are compatible. Compatible overriding signals can be merged
into one overriding signal. For example, in Figure 2, $s_{1/1}$ and
$s_{3/1}$ are not compatible because to apply generator gen($C^2$),
$s_{1/1}$ should be set to a 0 while $s_{3/1}$ should be set to a 1. On
the other hand, $s_{1/0}$ and $s_{2/0}$ are compatible.

Considering the inverse relation can expand the definition of
compatible inputs. Note that, for example in Figure 2, in every
generator where overriding signal $s_{1/1}$ is assigned $v$ ($v$ = 0 or
1), overriding signal $s_{3/1}$ is assigned $\bar{v}$ or X. Hence, if
overriding signal $s_{3/1}$ is inverted before it drives the two input
OR gate that fixes input $p_3$ while $s_{1/1}$ drives the two input OR
gate that fixes input $p_1$ directly, then $s_{3/1}$ and $s_{1/1}$ can be
merged. Such overriding signals $s_{a/y}$ and $s_{b/z}$ ($a \neq b$) are
said to be inversely compatible. Finding the minimum number of
overriding signals can be formulated as a maximum independent set
problem. For more background information, see J. A. Bondy and U. S.
R. Murty, Graph Theory with Applications, American Elsevier
Publishing Co. Inc., New York, N.Y., 1982.
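
The compatibility tests can be made concrete with a small sketch. Each
overriding signal is represented by the value it must take in each
generator ('0', '1', or 'X' for don't care). The greedy grouping below
is a simple heuristic chosen for illustration, not the maximum
independent set formulation mentioned above, and the signal names and
patterns in the example are hypothetical rather than taken from Figure 2.

```python
def merge_pattern(a, b):
    """Merge two per-generator value strings; None if they clash on 0 vs 1."""
    merged = []
    for x, y in zip(a, b):
        if x == "X":
            merged.append(y)
        elif y == "X" or x == y:
            merged.append(x)
        else:
            return None
    return "".join(merged)


def invert(pattern):
    """Swap 0 and 1, leaving don't cares untouched."""
    return pattern.translate(str.maketrans("01", "10"))


def greedy_merge(signals):
    """Group signals that can share one decoder output.

    signals: dict name -> pattern. Returns a list of (pattern, members),
    where members is a list of (name, inverted) pairs and `inverted`
    marks signals that need an inverter on the shared output.
    """
    groups = []
    for name, pattern in signals.items():
        for idx, (merged, members) in enumerate(groups):
            direct = merge_pattern(merged, pattern)
            if direct is not None:                      # compatible
                groups[idx] = (direct, members + [(name, False)])
                break
            inverse = merge_pattern(merged, invert(pattern))
            if inverse is not None:                     # inversely compatible
                groups[idx] = (inverse, members + [(name, True)])
                break
        else:
            groups.append((pattern, [(name, False)]))   # needs its own output
    return groups


example = {"a": "10X", "b": "1X0", "c": "011", "d": "X10"}
for pattern, members in greedy_merge(example):
    print(pattern, members)
```

In this toy input, signals a and b are compatible, c is inversely
compatible with their merged pattern, and d needs a separate decoder
output, so four signals are served by two outputs.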

Figure 5 shows an implementation for the generators and
overriding signals shown in Figure 2 where compatible overriding
signals are merged. Note that while the DECODER of the BIST
implementation for the same generators and overriding signals in the
conventional architecture shown in Figure 3 has 6 outputs, the
DECODER of the implementation corresponding to the disclosed
technique has only 3 outputs. Compatible overriding signals $s_{1/0}$,
$s_{2/0}$, and $s_{4/1}$ are merged and driven by the same output of
DECODER. $s_{3/1}$, which is inversely compatible with $s_{1/1}$, is
driven by the same DECODER output as $s_{1/1}$ through an inverter.
$s_{4/1}$, which has no compatible or inversely compatible input, is not
merged with any other input.

The overall algorithm for the disclosed improved ATPG design is
outlined below.

1. $i \leftarrow 0$, glob_gen $\leftarrow \{N, N, \ldots, N\}$.

2. Initialize the current testcube set, $C^i \leftarrow \emptyset$,
unmark all faults in the fault list, and generator,
gen($C^i$) = $\{X, X, \ldots, X\}$. $j \leftarrow 0$.

3. If there are no more faults in the fault list, then go to 6.
Generate a testcube $c^j$ by the disclosed improved ATPG.

4. Add the testcube $c^j$ to the current testcube set,
$C^i \leftarrow C^i \cup c^j$. $j \leftarrow j + 1$. Mark faults detected
by testcube $c^j$.

5. If the number of conflicting inputs of any testcube in $C^i$ is
greater than M (M is a positive integer), then
$C^i \leftarrow C^i - c^j$, $i \leftarrow i + 1$, update glob_gen, and go
to Step 2. Otherwise, update gen($C^i$) and go to Step 3.

6. Generate 3-weight WRPT patterns by fixing inputs or
applying pure random patterns to inputs according to
gen($C^i$). Run fault simulation to drop the faults that are
detected by generated 3-weight WRPT patterns.

7. Merge compatible overriding signals.

IV.C. Test Sequence for Each Generator

Assume that an m-bit LFSR is used to apply random patterns to
a CUT with m inputs. The LFSR is known to be the most cost-effective
random pattern generator. In this sub-section the pattern
length required to detect all faults in $F^i$ when an LFSR is used to
generate random patterns is computed. The escape probability of fault
f is the probability that fault f will not be detected even after t test
patterns are applied. For more background information on escape
probability, see P. H. Bardell, W. H. McAnney, and J. Savir, Built-in
Test for VLSI: Pseudorandom Techniques, John Wiley & Sons, 1987. If
repetition of any test pattern is not allowed, e.g., t test patterns,
where $t < 2^m$, are generated by an m stage LFSR that has a primitive
polynomial, the escape probability of fault f for t test patterns,
$ep_f(t)$, is given by Equation 8, where $V_f$ is the number of test
vectors for fault f. If $2^m \gg t$, Equation 8 can be approximated as
follows:

$$
ep_f(t) \approx \left(1 - \frac{V_f}{2^m}\right)^t. \tag{9}
$$
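
The step from the exact escape probability to the approximation can be
sketched as follows. The product form below is an assumption: it treats
the t patterns as distinct vectors drawn uniformly from all $2^m$ m-bit
vectors, which ignores the all-zero state that an LFSR skips, and it is
offered as one plausible reading rather than as the disclosed expression.

```latex
ep_f(t) = \prod_{j=0}^{t-1} \left(1 - \frac{V_f}{2^m - j}\right)
\approx \left(1 - \frac{V_f}{2^m}\right)^{t}
\quad \text{when } 2^m \gg t .
```

Each factor is the probability that the next pattern misses all $V_f$
test vectors given that the previous j patterns did. For j much smaller
than $2^m$, every factor is close to $1 - V_f/2^m$, which yields the
power form.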

The random pattern length required to detect fault f with escape
probability no larger than e is given by

$$
T = \left\lceil \frac{\log e}{\log\left(1 - V_f/2^m\right)} \right\rceil. \tag{10}
$$

Let $F^i$ be the set of faults that are detected by testcubes in
testcube set $C^i$. To detect all faults in $F^i$, a random sequence long
enough to detect the fault with the lowest detection probability should
be applied. If a new pattern is generated each cycle, then T patterns
can be generated in a time interval T. Hence time interval T and test
sequence length T can be used interchangeably. The length of such a
sequence, $T_{max}$, is given by

$$
T_{max}(C^i) = \max_{f \in F^i} \left\lceil \frac{\log e}{\log\left(1 - V_f/2^m\right)} \right\rceil. \tag{11}
$$

Since the number of conflicting bits for all testcubes is less than or
equal to M, the detection probability of any fault is greater than or
equal to $1/2^M$. Hence, Equation 11 can be rewritten as follows:

$$
T_{max} = \left\lceil \frac{\log e}{\log\left(1 - 1/2^M\right)} \right\rceil. \tag{12}
$$

According to the exhaustive experiments performed, when e
less than 0.2 is used, all $2^M$ M-bit patterns could be applied to any M
inputs. Table 1 shows the pattern length $T_{max}$ required for different
M's when e = 0.2 and 0.1.

Table 1: Pattern Length for e = 0.1, 0.2

e      M     Test Length
0.2    3     12
0.2    6     102
0.2    9     823
0.2    12    6591
0.1    3     17
0.1    6     146
0.1    9     1177
0.1    12    9430

In the case that the clock for T max is generated by dividing the 
test clock by powers of 2 for simple hardware, the interval can be 
obtained by computing the smallest power of 2 that is greater than or 
equal to T max . 
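
The pattern lengths and the power-of-two clock intervals can be checked
numerically. The sketch below evaluates the ratio inside Equation 12 and
the rounding to the next power of 2. The function names are
hypothetical. The entries of Table 1 coincide with the integer part of
the ratio, so the example prints that integer part next to the interval.

```python
import math


def tmax_ratio(e, M):
    """log(e) / log(1 - 1/2**M), the quantity bracketed in Equation 12."""
    if not 0.0 < e < 1.0 or M < 1:
        raise ValueError("need 0 < e < 1 and M >= 1")
    return math.log(e) / math.log(1.0 - 1.0 / 2 ** M)


def interval_power_of_two(t_max):
    """Smallest power of 2 that is greater than or equal to t_max."""
    interval = 1
    while interval < t_max:
        interval *= 2
    return interval


for e in (0.2, 0.1):
    for M in (3, 6, 9, 12):
        length = math.floor(tmax_ratio(e, M))
        print(e, M, length, interval_power_of_two(length))
```

For example, a length of 823 patterns (e = 0.2, M = 9) would be served
by an interval of 1024 test clock cycles when the interval clock is
derived by dividing the test clock by powers of 2.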

If all $2^M$ distinct M-bit patterns are applied to M conflicting
inputs, while the other inputs are fixed according to generator
gen($C^i$) during time interval T, all faults in $F^i$ are guaranteed to
be detected. The patterns to satisfy the above condition can also be
generated by test pattern generators designed for pseudoexhaustive
testing. Syndrome-driver counter, constant-weight counter, combined
LFSR/SR, combined LFSR/XOR, and condensed LFSR are test pattern
generators that can generate such patterns. For further background
information on constant-weight counter, see D. T. Tang and L. S. Woo,
Exhaustive Test Pattern Generation with Constant Weight Vectors, IEEE
Trans. on Computers, Vol. C-32(12), December 1983; and E. J. McCluskey,
Verification Testing - A Pseudoexhaustive Test Technique, IEEE
Trans. on Computers, Vol. C-33(6), June 1984. For further background
information on combined LFSR/SR, see D. T. Tang and C. L. Chen,
Logic Test Pattern Generation Using Linear Code, IEEE Trans. on
Computers, Vol. C-33(9), September 1984. For further background
information on combined LFSR/XOR, see S. B. Akers, On the Use of
Linear Sums in Exhaustive Testing, in Proceedings IEEE International
Conference on Fault-Tolerant Computing, pages 148-153, 1985. For
further background information on condensed LFSR, see L.-T. Wang
and E. J. McCluskey, A New Condensed Linear Feedback Shift Register
Design for VLSI/System Testing, in Proceedings IEEE International
Conference on Fault-Tolerant Computing, pages 360-365, 1984; and
L.-T. Wang and E. J. McCluskey, Condensed Linear Feedback Shift
Register (LFSR) Testing - A Pseudoexhaustive Test Technique, IEEE
Trans. on Computers, Vol. C-35(4), April 1986.

IV.D. Application to Test-Per-Scan BIST 

This sub-section discusses preferred embodiments of improved
test-per-scan BIST circuit architecture. Contrary to Touba'96 and
Wunderlich et al., since the disclosed BISTs generate random
sequences during each period T that are long enough to apply all 2^M
M-bit patterns to conflicting inputs (see IV.C), fault coverage
achieved by the disclosed BISTs is not dependent on the random pattern
sequences generated. See N. A. Touba and E. J. McCluskey, Altering
a Pseudo-Random Bit Sequence for Scan-Based BIST, in Proceedings
IEEE International Test Conference, pages 167-175, 1996; and H.-J.
Wunderlich and G. Kiefer, Bit-Flipping BIST, in Proceedings VLSI
Testing Symposium, pages 337-343, 1996. Two different types of
test-per-scan BISTs are disclosed. Preferred embodiments embodying
these two architectures are discussed herein.

IV.D.1. Parallel Type Test-Per-Scan BIST

The first type of test-per-scan BIST, which is called the parallel
type test-per-scan BIST, has some similarities to the test-per-clock
BIST. Figure 6 shows an implementation of the parallel type
test-per-scan BIST embodying the disclosed techniques. The same example
circuit for the generators and overriding signals shown in Figure 2 is
used in Figure 6. Like the test-per-clock BIST, the overriding
signals are generated from DECODER 6.10 with the outputs of
COUNTER 6.30 as inputs. However, unlike the test-per-clock BIST,
where overriding signals drive corresponding AND and OR gates, in the
parallel type test-per-scan BIST, the overriding signals drive the S
(synchronous preset) and R (synchronous reset) pins of scan elements
in the scan chain 6.60-6.64. Hence, in the parallel type test-per-scan
BIST, scan elements h_i (i = 1, 2, ...) that do not have S or R pins
should be replaced by scan elements that have an S and/or R pin if scan
input p_i, which is driven by scan element h_i, is assigned a 1, 0, or
B in the global generator glob_gen. If gg_i = 1, then scan element h_i
is replaced by a scan element 6.62 with an S pin. If gg_i = 0, then
scan element h_i is replaced by a scan element 6.61 with an R pin. And
finally, if gg_i = B, then scan element h_i is replaced by a scan
element 6.63 with both S and R pins. If a scan element already has a
high active S pin whose corresponding scan input is assigned a 1 in at
least one generator, then a two input OR gate (or a two input AND gate
if the S pin is low active) should be inserted between the S pin and
the normal preset signal. Likewise, if a scan element already has a
high active R pin whose corresponding scan input is assigned a 0 in at
least one generator, then a two input OR gate (or a two input AND gate
if the R pin is low active) should be inserted between the R pin and the
normal reset signal. For example, the scan element 6.60 driving p_1 of
the circuit shown in Figure 6 had an S pin in the original circuit, which
is controlled by SET/PRESET logic during the circuit's normal operation,
and hence an OR gate is inserted between overriding signal s_1/0 and
the S pin of the scan element 6.60. The LFSR 6.20 generates the
random test patterns.

Another difference between the test-per-clock BIST and the
parallel type test-per-scan BIST is that while input pin EN of DECODER
3.10 is directly driven by ENABLE 3.30 in the test-per-clock BIST, in
the parallel type test-per-scan BIST, EN of DECODER 6.10 is driven by
a two input AND gate 6.40 whose two inputs are driven by the LAST_SCAN
and ENABLE signals, respectively. The test-per-scan BIST requires n
test clocks, where n is the number of scan elements in the scan chain,
to scan a test vector into a scan chain during a scan shift operation.
Then the scan elements in the scan chain are configured in their
normal mode to capture the response to the scanned in test vector.
LAST_SCAN is set to a 1 for only one cycle at the end of each scan
shift operation and set to a 0 during all other cycles. Hence, DECODER
6.10 is enabled only at the end of a scan shift operation when a
vector, which is generated by a random pattern generator, is fully
loaded into the scan chain. When DECODER is enabled, the overriding
signals modify the random vector, which is loaded into the scan
chain, by activating the S and/or R pins of the scan elements that are
assigned a binary value in the generator that is currently being
applied.
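The load-then-override behavior described above can be modeled at a
purely behavioral level. The following Python sketch is an
illustration under our own assumptions: the function names, the
string encoding of a generator ('0', '1', 'U', 'X'), and the sample
generator "1XU0" are hypothetical, and the synchronous S/R timing is
abstracted into a single step after the last shift.

```python
import random


def apply_overrides(vector, generator):
    """Model the S/R override applied when DECODER is enabled.

    vector: list of 0/1 values already shifted into the scan chain.
    generator: string over '0', '1', 'U', 'X', one symbol per scan input.
    """
    if len(vector) != len(generator):
        raise ValueError("vector and generator must have the same length")
    result = []
    for bit, g in zip(vector, generator):
        if g == "1":
            result.append(1)    # S pin activated: element preset to 1
        elif g == "0":
            result.append(0)    # R pin activated: element reset to 0
        else:
            result.append(bit)  # U or X: random value is left untouched
    return result


def scan_load(generator, enable, rng):
    """Shift in n random bits; override only when LAST_SCAN and ENABLE are 1."""
    n = len(generator)
    chain = [0] * n
    for cycle in range(n):
        chain = [rng.randint(0, 1)] + chain[:-1]   # one scan shift cycle
        last_scan = 1 if cycle == n - 1 else 0     # high for one cycle only
        if enable and last_scan:                   # models AND gate 6.40
            chain = apply_overrides(chain, generator)
    return chain


print(scan_load("1XU0", enable=True, rng=random.Random(1)))
```

With enable set to False the function returns the unmodified random
vector, which corresponds to the pure pseudo-random phase.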

The area overhead of a parallel type test-per-scan BIST is close
to that of a test-per-clock BIST. Also the merging technique to reduce
overriding signals, which is described in IV.B, is applicable to the
parallel type test-per-scan BIST.

IV.D.2. Serial Type Test-Per-Scan BIST 

A preferred embodiment of the disclosed second type of
test-per-scan BIST, which is called the serial type test-per-scan BIST,
is shown in Figure 7. An OR gate 7.20 and an AND gate 7.30, which are
driven by the two output signals of DECODER 7.40, D_1 and D_0,
respectively, alter the random pattern sequence generated by an
LFSR 7.10. COUNTER 7.50 and SCAN-COUNTER 7.60 determine the
states of the two output signals of DECODER. SCAN-COUNTER is an
(n + 1)-modulo counter where n is the number of scan elements in the
scan chain 7.70. At least ⌈log2(n)⌉ stages of counter are necessary for
SCAN-COUNTER. Note that SCAN-COUNTER is required by all
test-per-scan BIST techniques and is not particular to implementing the
disclosed 3-weight WRPT BIST.

Like the parallel type test-per-scan BIST and test-per-clock
BIST, the area overhead of the decoder of the serial type test-per-scan
BIST is determined by the number of generators (the number of
stages of COUNTER) and the number of overriding signals specified in
the global generator, glob_gen. Typically, circuits that have many scan
elements will require many overriding signals if they are equally
random pattern resistant. Hence, such circuits may require high area
overhead to implement DECODER. In the test-per-clock BIST and
parallel type test-per-scan BIST, all inputs that are assigned binary
values in the generator that is currently being applied are fixed to the
corresponding binary values in parallel at the same cycle and hence
all compatible overriding signals can be merged into one overriding
signal.

However, the serial type test-per-scan BIST has only two
overriding signals, and values for scan inputs that are assigned binary
values in the generator that is currently being applied are altered
serially (at different cycles) by the two overriding signals. Hence, the
technique to merge compatible overriding signals to reduce area
overhead of DECODER, which is described in IV.B, is not applicable to
the serial type of test-per-scan BIST. In the serial type test-per-scan
BIST, area overhead of DECODER can be reduced by inserting toggle
flip-flops 9.10 and 9.20 between outputs of DECODER and the AND
and OR gates as shown in Figure 9.

Consider implementing a serial type test-per-scan BIST for two
generators, gen(C^0) and gen(C^1), shown in Figure 8. First, consider an
implementation shown in Figure 7 where the outputs of DECODER are
connected to the AND and OR gates directly so that one output of
DECODER, D_0, is connected to an input of the AND gate and the other
output of DECODER, D_1, is connected to an input of the OR gate.
Assume that the initial value of SCAN-COUNTER is 0 and the value for
scan input p_0 is scanned in first and that for scan input p_7 is
scanned in last. At the positive (or negative) edge of each scan shift
cycle, SCAN-COUNTER increases by 1 and a new value for a scan input is
scanned into the scan chain. This is repeated until the new values for
all 8 scan inputs are scanned in. If currently COUNTER = i and
SCAN-COUNTER = j, then the value for p_j, which is determined by g^i_j,
is scanned in; if g^i_j = 1, then a 1 is scanned in, and if g^i_j = 0,
then a 0 is scanned in, and if g^i_j = U or X, then a random binary
value is scanned in.

In Figure 8, g^0_3, g^0_5, and g^0_6 are assigned 1's in gen(C^0) and
g^1_4, g^1_5, and g^1_7 are assigned 1's in gen(C^1). Hence, s_1 is set
to a 1 when COUNTER = 0 and SCAN-COUNTER = 3, 5, and 6 and when COUNTER
= 1 and SCAN-COUNTER = 4, 5, and 7. The on-set of the function for
s_1 has 6 minterms. Similarly, s_0 is set to 1 when COUNTER = 0 and
SCAN-COUNTER = 1, 2, and 4 and when COUNTER = 1 and SCAN-COUNTER = 0, 2,
and 3. The on-set of the function for s_0 also has 6 minterms (see
Figure 8 (b)).
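The minterm counts for the direct implementation can be reproduced
with a short Python sketch. The string encoding of the two generators
and the function name on_set are our own; the positions of the 1's
and 0's are those stated in the text, and '?' marks the positions the
text does not spell out (assumed here to be U or X, so that they do
not contribute to either on-set).

```python
# Generators of the Figure 8 example, indexed by COUNTER value.
# Character j of each string is the value assigned to scan input p_j.
GENERATORS = {
    0: "?001011?",   # gen(C^0): 1's at p3, p5, p6; 0's at p1, p2, p4
    1: "0?0011?1",   # gen(C^1): 1's at p4, p5, p7; 0's at p0, p2, p3
}


def on_set(generators, value):
    """(COUNTER, SCAN-COUNTER) pairs where the direct decoder asserts the
    overriding signal that forces `value` ('1' for s_1, '0' for s_0)."""
    return [
        (counter, scan_counter)
        for counter, gen in sorted(generators.items())
        for scan_counter, g in enumerate(gen)
        if g == value
    ]


s1_minterms = on_set(GENERATORS, "1")
s0_minterms = on_set(GENERATORS, "0")
print("s_1:", len(s1_minterms), s1_minterms)
print("s_0:", len(s0_minterms), s0_minterms)
```

Each on-set contains one minterm per fixed scan input, so without
toggle flip-flops the decoder grows with the number of fixed values.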



Figure 9 shows a different implementation for the serial type
test-per-scan BIST described in the previous paragraph. Note that
toggle flip-flop TF_1 9.10 is inserted between an output D_1 of DECODER
9.30 and an input of the OR gate 9.40 and toggle flip-flop TF_0 9.20 is
inserted between the other output of DECODER, D_0, and an input of the
AND gate 9.50. Assume that toggle flip-flop TF_0 is initialized to a 1
and toggle flip-flop TF_1 is initialized to a 0 before each scan shift
operation starts. The COUNTER and SCAN-COUNTER are 9.60 and
9.70, respectively. First, consider applying gen(C^0). The state of
TF_1, which stays at a 0 until COUNTER = 0 and SCAN-COUNTER = 2, toggles
to a 1 when the value for p_3 is scanned in, i.e., COUNTER = 0 and
SCAN-COUNTER = 3. p_3 is the first scan input in the scan chain that is
assigned a 1 in gen(C^0). The state of TF_1 should toggle to a 0 at the
next cycle, SCAN-COUNTER = 4, since g^0_4 = 0. The state of TF_1 should
toggle one more time at COUNTER = 0 and SCAN-COUNTER = 5 and stay
at a 1 until the end of the scan shift cycles. While COUNTER = 1, i.e.,
gen(C^1) is being applied, the state of TF_1 needs to toggle only once,
when SCAN-COUNTER = 4, since p_4 is the first scan input in the scan
chain that is assigned a 1 in gen(C^1) and all scan inputs following
p_4 in the scan chain are assigned only X or 1 in gen(C^1). In
conclusion, the on-set of the function for s_1 has only 4 minterms:
COUNTER = 0 and SCAN-COUNTER = 3, COUNTER = 0 and SCAN-COUNTER = 4,
COUNTER = 0 and SCAN-COUNTER = 5, and COUNTER = 1 and
SCAN-COUNTER = 4. Similarly, the on-set of the function for s_0 also
has 4 minterms (see Figure 8 (c)). Note that if g^i_j = 1, i.e.,
TF_1 = 1, the state of TF_0 is don't care since the output of the OR
gate can be assigned a 1 by TF_1 = 1 independent of the state of TF_0.
Hence, the state of TF_0 at COUNTER = 0 and SCAN-COUNTER = 5 need not be
changed. In this example, though the number of minterms is reduced only
by 4 by inserting the toggle flip-flops, in practical circuits that have
long scan chains, inserting toggle flip-flops can drastically reduce the
number of minterms.

In order to obtain a DECODER that requires minimum area
overhead, scan chains should be reordered such that the toggle
flip-flops change their states the minimum number of times during scan
shift operations for each generator. In this disclosure, the problem of
reordering a scan chain is solved by using genetic algorithms. For
further background information on genetic algorithms, see D. E.
Goldberg, Genetic Algorithms in Search, Optimization, and Machine
Learning, Addison Wesley, Reading, MA, 1989.
Genetic algorithms have been widely used to solve NP-complete
problems in reasonable computation time since they were introduced by
Holland. See J. H. Holland, Adaptation in Natural and Artificial
Systems, University of Michigan Press, Ann Arbor, MI, 1975. Genetic
algorithms start with an initial population that is randomly generated.
A genetic code is assigned to each individual and a fitness function
that is calculated from the genetic code is given to each individual.
Individuals in each generation survive or vanish through a genetic
evolutionary process. Individuals that have better fitness functions are
given a higher chance to survive and to be forwarded to the next
generation. The surviving individuals often produce offspring
individuals that have partly different genetic codes from their own
genetic codes by exchanging their genetic codes with other individuals
(crossover process) and flipping a part of their genetic codes (mutation
process).

In the genetic algorithm used in this disclosure, a permutation
of scan elements in the scan chain is encoded as the genetic code of
each individual. The objective of the genetic algorithm used is to find
an order of scan elements that leads to the function of the decoder that
has the minimum number of minterms. A nonnegative fitness
function, which denotes the number of toggles required at the two
toggle flip-flops to alter random test sequences according to values
assigned in each generator, is calculated for each individual in the
current generation:

    Cost = \sum_{i=1}^{d} \sum_{j=1}^{n} \left( Toggle_0(i,j) + Toggle_1(i,j) \right),    (13)

where d is the number of generators and n is the number of scan inputs
and Toggle_0(i,j) and Toggle_1(i,j) are given as follows:

    Toggle_0(i,j) = \begin{cases}
        1 & \text{if } (TF_0(i,j-1) = 0 \text{ and } g^i_j = 0)
            \text{ or } (TF_0(i,j-1) = 1 \text{ and } g^i_j = U) \\
        0 & \text{otherwise}
    \end{cases}

    Toggle_1(i,j) = \begin{cases}
        1 & \text{if } (TF_1(i,j-1) = 1 \text{ and } (g^i_j = 0 \text{ or } g^i_j = U))
            \text{ or } (TF_1(i,j-1) = 0 \text{ and } g^i_j = 1) \\
        0 & \text{otherwise,}
    \end{cases}    (14)

where TF_0(i,j-1) and TF_1(i,j-1) are the states of the toggle
flip-flops at the previous scan shift cycle. Initially (j = 1),
TF_0(i,j) = 1 and TF_1(i,j) = 0 and thereafter TF_0(i,j) (TF_1(i,j))
toggles its state whenever Toggle_0(i,j) = 1 (Toggle_1(i,j) = 1).
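The cost of equations (13) and (14) and a permutation search over it
can be sketched as follows. This is a minimal illustration, not the
disclosed implementation: the order crossover, the swap mutation, the
population parameters, and all function names are our own assumptions,
since the text does not specify the genetic operators. The generator
strings encode the Figure 8 example; the U and X entries are inferred
from the minterm counts quoted in the text, not read from the figure.

```python
import random


def toggle_cost(generators, order):
    """Cost of eq. (13): toggles of TF_0 and TF_1 summed over all generators.

    generators: strings over '0', '1', 'U', 'X', one symbol per scan input.
    order: permutation of scan input indices, listed in scan-in order.
    """
    cost = 0
    for gen in generators:
        tf0, tf1 = 1, 0                  # states before each scan shift
        for idx in order:
            g = gen[idx]
            # Toggle_0 of eq. (14)
            if (tf0 == 0 and g == "0") or (tf0 == 1 and g == "U"):
                tf0 ^= 1
                cost += 1
            # Toggle_1 of eq. (14)
            if (tf1 == 1 and g in ("0", "U")) or (tf1 == 0 and g == "1"):
                tf1 ^= 1
                cost += 1
    return cost


def order_crossover(p1, p2, rng):
    """Assumed crossover: keep a slice of p1, fill the rest in p2's order."""
    a, b = sorted(rng.sample(range(len(p1)), 2))
    kept = set(p1[a:b + 1])
    rest = iter(x for x in p2 if x not in kept)
    return [p1[k] if a <= k <= b else next(rest) for k in range(len(p1))]


def swap_mutation(perm, rng):
    """Assumed mutation: exchange two randomly chosen positions."""
    a, b = rng.sample(range(len(perm)), 2)
    perm = list(perm)
    perm[a], perm[b] = perm[b], perm[a]
    return perm


def reorder_scan_chain(generators, rng, pop_size=20, n_generations=40,
                       p_mut=0.3):
    """Search for a scan order with few toggles (lower cost is fitter)."""
    n = len(generators[0])

    def cost(perm):
        return toggle_cost(generators, perm)

    population = [list(range(n))]                  # keep the original order
    population += [rng.sample(range(n), n) for _ in range(pop_size - 1)]
    for _ in range(n_generations):
        population.sort(key=cost)
        survivors = population[:pop_size // 2]     # fitter half survives
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            child = order_crossover(p1, p2, rng)
            if rng.random() < p_mut:
                child = swap_mutation(child, rng)
            children.append(child)
        population = survivors + children
    return min(population, key=cost)


GENERATORS = ["U001011X", "0U0011X1"]   # gen(C^0), gen(C^1), partly inferred
original = list(range(8))
best = reorder_scan_chain(GENERATORS, random.Random(0))
print("original order cost:", toggle_cost(GENERATORS, original))
print("best order found:", best, "cost:", toggle_cost(GENERATORS, best))
```

Under the stated assumptions, the original order p_0 ... p_7 has a cost
of 8, which agrees with the 4 + 4 minterms of the worked example.
Because the fitter half of each generation is carried over unchanged,
the order returned is never worse than the original one.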

As compatible overriding signals can be merged in the test-per-clock
BIST and the parallel type test-per-scan BIST, in the serial type
test-per-scan BIST, compatible scan inputs can be merged before scan
chains are ordered by the genetic algorithm. Any two scan inputs, p_i
and p_j, are compatible if, in every generator, the value assigned to
p_i is the same as that assigned to p_j or either of the two is
assigned X; otherwise p_i and p_j are not compatible. If compatible
scan inputs are merged into a group of scan inputs and groups of
compatible scan inputs are ordered instead of individual scan inputs,
the number of inputs to the genetic algorithm is reduced, and hence so
is the run time it needs to obtain the order that leads to the minimum
number of minterms.
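The compatibility test can be written down directly. In the Python
sketch below, the greedy grouping strategy, the function names, and
the two sample generators are our own assumptions; the text defines
only the pairwise compatibility condition.

```python
def compatible(generators, a, b):
    """True if scan inputs a and b agree, or one is X, in every generator."""
    return all(g[a] == g[b] or g[a] == "X" or g[b] == "X" for g in generators)


def group_compatible(generators):
    """Greedily merge scan inputs into groups of pairwise compatible inputs."""
    n = len(generators[0])
    groups = []
    for p in range(n):
        for group in groups:
            # Compatibility is not transitive (X matches anything), so a
            # new member is checked against every input already in the group.
            if all(compatible(generators, p, q) for q in group):
                group.append(p)
                break
        else:
            groups.append([p])
    return groups


# Hypothetical generators over four scan inputs p0..p3.
SAMPLE = ["1X10", "X1U0"]
print(group_compatible(SAMPLE))
```

In the sample, p0 is compatible with both p1 and p2, but p1 and p2 are
not compatible with each other, which is why the pairwise check inside
each group is needed.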

Due to routing or load capacitance constraints (or timing
constraints) of scan elements, some scan elements should be placed
next to each other so as not to violate routing and/or load capacitance
constraints. This limitation in reordering scan chains can be relaxed
by exploiting the compatibility of scan inputs. The toggle flip-flops
of the serial type test-per-scan BIST do not toggle their states while
values for the compatible scan inputs are scanned in. Hence, scan
inputs in a group of compatible scan inputs can be rearranged to
satisfy routing or load capacitance constraints of scan elements
without increasing the number of minterms of the function to implement
the decoder that generates overriding signals. Typically, a significant
number of scan inputs are assigned X in every generator. These scan
inputs can be placed in any position of the scan chain since they are
compatible with any other scan input in the scan chain. If such scan
inputs are placed in proper positions, routing or load capacitance
constraints of scan elements can be satisfied without increasing the
number of minterms to implement the decoder.

Typically, the serial type test-per-scan BIST requires more 
hardware overhead to implement the decoder than the parallel type 
test-per-scan BIST. However, since the serial type test-per-scan BIST 
does not require scan elements that have R and/or S pins or extra two 
input OR or AND gates that are required to fix scan inputs in the 
parallel type test-per-scan BIST, overall hardware overhead to 
implement the serial type test-per-scan BIST is comparable to that to 
implement the parallel type test-per-scan BIST. Furthermore, 
compared to the parallel type test-per-scan BIST, which requires 
routing overhead to connect overriding signals from the output pins of 
the decoder to corresponding S and/or R pins of scan elements that 
need to be fixed, the serial type test-per-scan BIST, which has only 
two overriding signals, requires very little routing overhead. Hence, if 
the design in which the BIST is to be inserted has routing congestion, 
the serial type test-per-scan BIST will be a better choice. 

IV.E. Experimental Results

Results of simulation experiments using the disclosed BIST are
reported in Table 2. The experiments were conducted with logic
synthesis and ISCAS benchmark circuits. Most of the logic synthesis
benchmark circuits have many more hard-to-detect faults than the ISCAS
benchmark circuits. Also, to demonstrate the scalability of the
disclosed improved ATPG, the experimental results for large ISCAS
benchmark circuits are reported. The experiments were performed on
a 300 MHz Sun Ultra 2 with 1 Gbyte of memory.

In the experiments for the test-per-clock BIST, the outputs of all
flip-flops of the circuits are assumed to be primary inputs and the
inputs of all flip-flops are assumed to be primary outputs. Also, in
the experiments for test-per-scan BIST, it is assumed that all primary
inputs are driven by the same scan chain, which contains all flip-flops
in the circuit, all of which are transformed to scan elements.

The column labeled # PI shows the number of primary inputs
and # lines the number of circuit lines of each circuit. The column
labeled # HF reports the number of faults that remain undetected after
a set of pseudo-random patterns are applied. The numbers in
parentheses in the same column are the numbers of the pseudo-random
patterns that are applied to detect easy-to-detect faults (for
short, easy faults). The column labeled EFC reports fault coverages
that are obtained by applying pseudo-random patterns. In order to
demonstrate the feasibility of applying the disclosed technique to
circuits with a significant number of hard-to-detect faults, very short
random pattern sequences are applied. The column labeled # RPL
reports the number of pure pseudo-random patterns that are
generated by m-stage LFSRs, where m is the number of inputs, to
achieve 100% or close to 100% fault coverage. The column labeled
Tlen shows the total number of test patterns that are generated by the
disclosed BISTs to achieve 100% fault coverage, and the numbers in
parentheses in the same column are the reduction factors in test
sequence length due to the disclosed BIST techniques. m-stage LFSRs
(where m is the number of inputs) are used to apply random patterns
to conflicting inputs. The column labeled M shows the maximum number of
conflicting inputs of each testcube set. The column labeled T shows
the periods during which each generator is applied to circuits. The
column labeled # G shows the number of generators generated by the
disclosed improved ATPG. The column labeled # before OS shows the
number of overriding signals before compatible overriding signals are
merged and # after OS shows the number of overriding signals after
compatible overriding signals are merged. Run times reported in the
column labeled Run Time include fault simulation time for the entire
test sequence (the pseudo-random pattern sequence applied to detect easy
faults and the 3-weight WRPT sequence) generated by the disclosed BIST
as well as the disclosed improved ATPG run time. A backtrack limit of
500 is used for all circuits.

Table 2: Experimental Results

The heading DECODER reports hardware overhead due to
implementing the decoders, which generate overriding signals, as gate
equivalents. The column labeled TPC reports hardware
overhead for test-per-clock BIST implementations and TPS for serial
type test-per-scan BIST implementations. In order to demonstrate
that inserting two toggle flip-flops and reordering scan chains (See
Section 6.2) can reduce hardware overhead to implement the
decoders, gate equivalents for both implementations are reported:
implementation with toggle flip-flops and re-ordered scan chains
(numbers at the left-hand side of the column) and implementation
without toggle flip-flops (numbers in the parentheses in the column).
The decoder circuits are obtained by running SIS for two level circuit
implementations. Only NAND and NOR gates and inverters are used to
synthesize the decoders. The gate equivalents are computed in the
manner suggested in J. Hartmann: 0.5n for an n-input NAND or NOR
gate and 0.5 for an inverter. See J. Hartmann and G. Kemnitz, How
to Do Weighted Random Testing for BIST, in Proceedings IEEE
International Conference on Computer-Aided Design, pages 568-571,
1993.
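The gate-equivalent rule quoted above is simple enough to state as a
small Python helper. The function name and the example netlist are
hypothetical and serve only to show how the 0.5n and 0.5 weights are
applied.

```python
def gate_equivalents(nand_nor_fanins, inverters=0):
    """Gate equivalents: 0.5 * n per n-input NAND/NOR, 0.5 per inverter."""
    return sum(0.5 * n for n in nand_nor_fanins) + 0.5 * inverters


# Hypothetical two-level decoder: three 4-input NAND gates feeding one
# 3-input NAND gate, plus two inverters on the inputs.
print(gate_equivalents([4, 4, 4, 3], inverters=2))
```

For the hypothetical netlist this gives 0.5 * 15 + 0.5 * 2 = 8.5 gate
equivalents.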

100% fault coverage is attained for all benchmark circuits. The
test length required to achieve 100% fault coverage is dramatically
reduced by the disclosed BIST technique. It is reported in M. F.
AlShaibi that s838 requires more than 100 Meg test patterns to
achieve 100% fault coverage. See M. F. AlShaibi and C. R. Kime,
MFBIST: A BIST Method for Random Pattern Resistant Circuits, in
Proceedings IEEE International Test Conference, pages 176-185, 1996.
However, test patterns generated by the disclosed BIST achieve the
same fault coverage with only 512 + 6 x 512 = 3,072 test patterns, 
where 512 is the number of pure random patterns applied to detect 
easy faults. This implies that the test length required to detect all
faults is reduced by a factor of about 32 x 10^6. Clearly, greater
reductions are achieved for the random pattern resistant circuits that
require long pure random test sequences, such as xparc, s838, c5315,
and c7552, where test sequence lengths are reduced by several orders of
magnitude. It should be noticed that only 26 generators are required
for s38417, which requires the largest number of generators, even though
an unrealistically large number of hard-to-detect faults, 3428 faults,
are used. This is due to the fact that many of the tests for
hard-to-detect faults share common input assignments for their
detection. Since the test sequence length required for the disclosed
BIST is determined by the number of generators, this demonstrates that
the test sequence length will not grow even in circuits that have a
large number of hard-to-detect faults. In fact, even though a
significant number of hard-to-detect faults are used for all circuits,
test sequence lengths of 3-weight WRPT are even shorter than those of
the pseudo-random test sequences that are applied to detect easy faults
in most circuits. This indicates
that test sequences with reasonable lengths can be obtained for large 



M so that hardware overhead to implement the disclosed BISTs will be 
low in large practical designs. 

To estimate the hardware overhead that should be added to
pseudo-random BIST circuitry such as LFSRs to implement the
disclosed BIST techniques, the gate equivalents of the decoders that
generate overriding signals are listed. The number of the AND and/or
OR gates, which should also be added to pseudo-random BIST circuitry
to override random signals in test-per-clock BISTs (see Figure 3), is
the same as the number of overriding signals before the compatible
inputs are merged, i.e., the numbers listed in the column labeled
# before OS. The number of flip-flops to implement the counter that
selects each generator can simply be computed by ⌈log2(the number of
generators)⌉ and hence grows very slowly as the number of generators
grows. Only a 5-bit counter is enough for s38417, which requires 26
generators. Very little hardware is required to implement the decoders
for test-per-clock BISTs and parallel type test-per-scan BISTs. Even
for large circuits such as s38417 with 3428 hard faults, the gate
equivalent is only 169.5. Although serial type test-per-scan BISTs
require more hardware overhead to implement the decoders than
test-per-clock BISTs, hardware overhead to implement the decoders
will not have significant area impact. For example, the gate equivalent
required to implement the decoder of a serial type test-per-scan BIST
for s38417, whose combinational portion has about 24,500 gate
equivalents, is only 626, in other words, only 2.6% of the gate
equivalents of the combinational portion of s38417. This clearly
demonstrates that the disclosed BIST is applicable to large circuits
with large numbers of hard-to-detect faults at affordable cost.
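The two figures quoted for s38417 can be checked with a few lines of
Python. The function name counter_stages is ours; the numbers 26, 626,
and 24,500 are those given in the text.

```python
def counter_stages(num_generators: int) -> int:
    """Flip-flops needed to select among num_generators generators,
    i.e. ceil(log2(num_generators)), computed with integers only."""
    if num_generators < 1:
        raise ValueError("at least one generator is required")
    return (num_generators - 1).bit_length()


stages = counter_stages(26)                 # s38417 needs 26 generators
decoder_share = 100 * 626 / 24500           # decoder GE vs. combinational GE
print(stages, round(decoder_share, 1))
```

The integer bit_length form avoids floating point rounding for
generator counts that are exact powers of 2.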

The experimental results show that hardware overhead to
implement serial type test-per-scan BISTs can be significantly reduced
by inserting toggle flip-flops and reordering scan chains. Except for a
few circuits such as xparc, bc0, and rckl (even for such circuits, the
increase in gate equivalents is negligible), a significant reduction in
gate equivalents is obtained for most circuits. In particular, large
reductions are obtained for large circuits such as s9234 (76%
reduction), s13207 (68.5% reduction), s38417 (66.4% reduction), and
s38584 (66.3% reduction).

The run time reported in the table clearly demonstrates the
feasibility of the disclosed improved ATPG. Experiments show that, in
most circuits, more run time is spent on the fault simulation that is
performed to measure the fault coverage obtained by the disclosed BIST
sequences than on the disclosed improved ATPG itself.

Table 3 compares the results of the disclosed test-per-clock
BIST with those of other test-per-clock BIST methods, which use
similar random pattern testing techniques, on the test sequence length
(heading Tlen) and hardware overhead added to pure pseudo-random
BIST. For more background information on these other methods, see
I. Pomeranz and S. Reddy, 3-Weight Pseudo-Random Test Generation
Based on a Deterministic Test Set for Combinational and Sequential
Circuits, IEEE Trans. on Computer-Aided Design of Integrated Circuits
and Systems, Vol. 12:1050-1058, July 1993; M. Bershteyn, Calculation
of Multiple Sets of Weights for Weighted Random Testing, in
Proceedings IEEE International Test Conference, pages 1031-1040,
1993; M. F. AlShaibi and C. R. Kime, Fixed-Biased Pseudorandom
Built-in Self-Test For Random Pattern Resistant Circuits, in
Proceedings IEEE International Test Conference, pages 929-938,
1994; and N. Touba and E. McCluskey, Synthesis of Mapping Logic for
Generating Transformed Pseudo-Random Patterns for BIST, in
Proceedings IEEE International Test Conference, pages 674-682, 1995.
Hardware overheads are estimated on the number of flip-flops
FF and gates GE, separately. GE's for Bershteyn, Pomeranz et al., and
Touba'95 are computed as described in Touba'95. See M. Bershteyn,
Calculation of Multiple Sets of Weights for Weighted Random Testing,
in Proceedings IEEE International Test Conference, pages 1031-1040,
1993; I. Pomeranz and S. Reddy, 3-Weight Pseudo-Random Test
Generation Based on a Deterministic Test Set for Combinational and
Sequential Circuits, IEEE Trans. on Computer-Aided Design of
Integrated Circuits and Systems, Vol. 12:1050-1058, July 1993; and N.
Touba and E. McCluskey, Synthesis of Mapping Logic for Generating
Transformed Pseudo-Random Patterns for BIST, in Proceedings IEEE
International Test Conference, pages 674-682, 1995.

Table 3: Comparisons (Test-Per-Clock BIST) 

M. Bershteyn uses conventional weighted random pattern BIST
with various weights. See M. Bershteyn, Calculation of Multiple Sets
of Weights for Weighted Random Testing, in Proceedings IEEE
International Test Conference, pages 1031-1040, 1993. The column
WS shows the number of weight sets. The number of flip-flops, FF,
and gates, GE, are computed as follows:
FF = log2(number of weight sets)
GE = (4 + 1.5 WS)(number of inputs in CUT).

Pomeranz uses a technique similar to the disclosed BIST. See
I. Pomeranz and S. Reddy, 3-Weight Pseudo-Random Test Generation
Based on a Deterministic Test Set for Combinational and Sequential
Circuits, IEEE Trans. on Computer-Aided Design of Integrated Circuits
and Systems, Vol. 12:1050-1058, July 1993. A difference of Pomeranz
from the disclosed BIST is in the way testcubes are generated. See I.
Pomeranz and S. Reddy, 3-Weight Pseudo-Random Test Generation
Based on a Deterministic Test Set for Combinational and Sequential
Circuits, IEEE Trans. on Computer-Aided Design of Integrated Circuits
and Systems, Vol. 12:1050-1058, July 1993.

In contrast to the disclosed technique, in Pomeranz, the
testcubes for all hard-to-detect faults are prepared before the
procedure to find inputs to be fixed is applied. See I. Pomeranz and S.
Reddy, 3-Weight Pseudo-Random Test Generation Based on a
Deterministic Test Set for Combinational and Sequential Circuits, IEEE
Trans. on Computer-Aided Design of Integrated Circuits and Systems,
Vol. 12:1050-1058, July 1993. 3-gate modules are used to fix a
subset of inputs to deterministic vectors that are called expanded
tests. The hardware overhead is computed as follows:
FF = log2(number of expanded tests)
GE = (number of 3-gate modules)(1 + average fan-in).

In Touba'95, random vectors that do not detect any new faults
are mapped to the deterministic vectors for hard-to-detect faults by a
mapping logic that is placed between the TPG and the inputs of a CUT.
See N. Touba and E. McCluskey, Synthesis of Mapping Logic for
Generating Transformed Pseudo-Random Patterns for BIST, in
Proceedings IEEE International Test Conference, pages 674-682, 1995.
The column GE shows the gate equivalents for the mapping logic.
The estimation of the gate equivalents is based on J. Hartmann:
0.5n GE's for an n-input NAND or NOR, 2.5(n - 1) GE's for an n-input
XOR, and 1.5 GE's for a 2-to-1 MUX. See J. Hartmann and G. Kemnitz,
How to Do Weighted Random Testing for BIST, in Proceedings IEEE
International Conference on Computer-Aided Design, pages 568-571,
1993. Since the mapping logic can degrade circuit performance
seriously, the authors recommend using MUX's to bypass the mapping
logic during normal operation. The numbers in parentheses in the
column GE include gate equivalents for such MUX's. See M. F. AlShaibi
and C. R. Kime, MFBIST: A BIST Method for Random Pattern Resistant
Circuits, in Proceedings IEEE International Test Conference, pages
176-185, 1996.

M. F. AlShaibi'96 is an improved version of M. F. AlShaibi'94
where multiple idler registers are used to apply random patterns or
fixed values to the inputs of a circuit. See M. F. AlShaibi and C. R.
Kime, MFBIST: A BIST Method for Random Pattern Resistant Circuits,
in Proceedings IEEE International Test Conference, pages 176-185,
1996; and M. F. AlShaibi and C. R. Kime, Fixed-Biased Pseudorandom
Built-in Self-Test For Random Pattern Resistant Circuits, in
Proceedings IEEE International Test Conference, pages 929-938, 1994.
An SFN cell is assigned to each input to be fixed. Each SFN cell,
which is composed of two flip-flops, can function in four different
modes by reconfiguring the connections of the two flip-flops. For each
configuration sequence, a new set of fixed values, which are stored in
a memory device, are loaded into the corresponding SFN cells. The
hardware overhead reported is the gate equivalents for only the SFN
cells, where F is the number of fixed inputs:
GE = (SFN cost x F = 7 x F).

Unlike all other methods, where hardware overheads for the
BIST controllers, which may be complex circuits, are not counted, the
GEs for the disclosed BIST include not only the gates to fix inputs but
also the gates to implement the BIST controllers. Nonetheless, the GEs
for the disclosed BIST are the smallest for most circuits. A key reason,
it is believed, is the introduction of the disclosed improved ATPG.
Compared to M. F. AlShaibi, which requires many flip-flops, the
disclosed BISTs require very few flip-flops (only 5 flip-flops for the
largest circuit). Also, compared to M. F. AlShaibi, which may require a
complex control scheme, the disclosed BIST has a very simple
architecture. See M. F. AlShaibi and C. R. Kime, MFBIST: A BIST
Method for Random Pattern Resistant Circuits, in Proceedings IEEE
International Test Conference, pages 176-185, 1996.

The test sequence lengths for some circuits of Touba'95 and M. F.
AlShaibi'96 are shorter than those of the disclosed BIST. See N. Touba
and E. McCluskey, Synthesis of Mapping Logic for Generating
Transformed Pseudo-Random Patterns for BIST, in Proceedings IEEE
International Test Conference, pages 674-682, 1995; and M. F.
AlShaibi and C. R. Kime, MFBIST: A BIST Method for Random Pattern
Resistant Circuits, in Proceedings IEEE International Test Conference,
pages 176-185, 1996. However, in the disclosed technique, the periods
T_i during which generator gen(C^i) is applied are the same for every
generator gen(C^i), where i = 1, 2, .... Hence, even when all faults
that are targeted by generator gen(C^i) are detected before all T_i test
patterns are applied, the disclosed BIST continues to generate test
patterns for generator gen(C^i) until all T_i test patterns are
generated. This increases test sequence length since many test
patterns are wasted without detecting any new faults. If variable
periods are used so that each generator is applied only until all faults
that are targeted by that generator are detected, then test sequence
length can be reduced. However, this will make BIST controllers more
complex. The use of a single fixed period is one of the reasons that
the GEs for the disclosed BIST are the smallest for most circuits even
though the hardware overhead to implement BIST controllers is included.
Another advantage of using the same period for all generators is
that it makes the implemented BISTs less dependent on the random
pattern sequences that are applied to inputs that are not fixed. This
allows the feedback polynomials or seeds of the LFSRs that are used to
generate random pattern sequences for the disclosed BISTs to be
changed even after the BIST controllers are already implemented. Note
that the periods for which each generator is applied are powers of 2
(the periods for which random pattern sequences are applied to detect
easy faults are also powers of 2). This also increases test sequence
lengths for the disclosed BISTs. However, this reduces hardware
overhead.
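
The trade-off described above between one fixed period and variable periods can be illustrated numerically. The sketch below is a simplified model under stated assumptions, not the disclosed controller: it supposes that the common period is the smallest power of 2 that covers the most demanding generator, that the random phase is likewise rounded up to a power of 2, and that the number of patterns each generator needs is known. All names and numbers are hypothetical.

```python
def next_power_of_two(n):
    """Smallest power of 2 that is >= n (n must be a positive integer)."""
    return 1 << (n - 1).bit_length()


def fixed_period_length(random_patterns, needed_per_generator):
    """Test length when every generator runs for the same power-of-2 period."""
    period = next_power_of_two(max(needed_per_generator))
    return next_power_of_two(random_patterns) + period * len(needed_per_generator)


def variable_period_length(random_patterns, needed_per_generator):
    """Test length when each generator stops once its target faults are detected."""
    return random_patterns + sum(needed_per_generator)


# Hypothetical pattern counts needed by three generators after a
# 1000-pattern random phase.
needed = [100, 700, 30]
fixed_total = fixed_period_length(1000, needed)
variable_total = variable_period_length(1000, needed)
```

In this model the fixed-period schedule is longer, but a controller for it only has to recognize power-of-2 counts that are identical for every generator, which is consistent with the hardware saving the text attributes to this choice.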

Table 4 compares the results of the disclosed test-per-scan BIST
with those of other test-per-scan BIST methods. See N. A. Touba and
E. J. McCluskey, Altering a Pseudo-Random Bit Sequence for Scan-
Based BIST, in Proceedings IEEE International Test Conference, pages
167-175, 1996; H.-J. Wunderlich and G. Kiefer, Bit-Flipping BIST, in
Proceedings VLSI Testing Symposium, pages 337-343, 1996; and M.
Karkala, N. A. Touba, and H.-J. Wunderlich, Special ATPG to Correlate
Test Patterns for Low-Overhead Mixed-Mode BIST, in Proceedings
7th Asian Test Symposium, 1998. Results are compared on the
test sequence length and hardware overhead. Hardware overhead for
the disclosed BIST is based on the serial type test-per-scan BIST
implementation.

In Touba'96 and Karkala et al., the columns labeled LFSR Size
show the number of stages of the LFSRs that are used to generate
pseudo-random sequences, and the columns labeled Seq. ID Reg. Size
show the size of the Sequence ID registers, which drive one of the
inputs of the bit-fixing sequence generation logic. See N. A. Touba and
E. J. McCluskey, Altering a Pseudo-Random Bit Sequence for Scan-Based
BIST, in Proceedings IEEE International Test Conference, pages 167-
175, 1996; and M. Karkala, N. A. Touba, and H.-J. Wunderlich, Special
ATPG to Correlate Test Patterns for Low-Overhead Mixed-Mode BIST,
in Proceedings 7th Asian Test Symposium, 1998. The other input of
the bit-fixing sequence generation logic is driven by a Mod-(m+1)
Counter, which corresponds to SCAN-COUNTER of the disclosed
BIST (see Figure 7). The bit-fixing sequence generation logic, which
corresponds to the decoder of the disclosed BIST that generates
overriding signals, generates control signals to convert useless pseudo-
random sequences to deterministic testcubes. The columns Lit Count
show the literal count of the multilevel bit-fixing sequence generation
logic. Results published in Touba'96 and Karkala et al. were obtained
by applying 10,000 test patterns in all cases. See N. A. Touba and E. J.
McCluskey, Altering a Pseudo-Random Bit Sequence for Scan-Based
BIST, in Proceedings IEEE International Test Conference, pages 167-
175, 1996; and M. Karkala, N. A. Touba, and H.-J. Wunderlich, Special
ATPG to Correlate Test Patterns for Low-Overhead Mixed-Mode BIST,
in Proceedings 7th Asian Test Symposium, 1998.

Table 4: Comparisons (Test-Per-Scan BIST)

The column labeled Hard Fault of H.-J. Wunderlich shows the
number of non-redundant faults that remain undetected after applying
10,000 pseudo-random patterns. See H.-J. Wunderlich and G. Kiefer,
Bit-Flipping BIST, in Proceedings VLSI Testing Symposium, pages
337-343, 1996. The column labeled Prod. Term shows the number of
product terms required to implement the bit-flipping functions, which
correspond to the bit-fixing sequence generation logic of Touba'96 and
Karkala et al. and to the decoders in the disclosed BIST. See N. A.
Touba and E. J. McCluskey, Altering a Pseudo-Random Bit Sequence for
Scan-Based BIST, in Proceedings IEEE International Test Conference,
pages 167-175, 1996; and M. Karkala, N. A. Touba, and H.-J.
Wunderlich, Special ATPG to Correlate Test Patterns for Low-Overhead
Mixed-Mode BIST, in Proceedings 7th Asian Test Symposium, 1998. The
bit-flipping functions for all benchmark circuits are designed to
achieve 100% fault coverage when 10,000 patterns are applied. For all
cases, a 32-stage LFSR is used to generate pseudo-random sequences.
Note that H.-J. Wunderlich considers only hard faults that remain
undetected after 10,000 patterns are applied. See H.-J. Wunderlich and
G. Kiefer, Bit-Flipping BIST, in Proceedings VLSI Testing Symposium,
pages 337-343, 1996. Hence, when the 10,000 patterns that are applied
to detect easy faults are included, the total number of test patterns
applied to each circuit is 20,000.

Both the number of product terms and the literal count required to
implement the decoder are reported for the disclosed BIST. Even
though the sequences generated by the disclosed BISTs are shorter
than those generated by other BISTs, the hardware overhead required to
implement the disclosed BISTs is the lowest for all circuits except
s1196. (10,000 patterns for N. A. Touba and E. J. McCluskey, Altering a
Pseudo-Random Bit Sequence for Scan-Based BIST, in Proceedings
IEEE International Test Conference, pages 167-175, 1996; and M.
Karkala, N. A. Touba, and H.-J. Wunderlich, Special ATPG to Correlate
Test Patterns for Low-Overhead Mixed-Mode BIST, in Proceedings 7th
Asian Test Symposium, 1998; and 20,000 patterns for H.-J.
Wunderlich and G. Kiefer, Bit-Flipping BIST, in Proceedings VLSI
Testing Symposium, pages 337-343, 1996.)
Also, the sizes of the Sequence ID registers of Touba'96 and Karkala
et al., which correspond to COUNTERS of the disclosed BIST, are larger
than the size of COUNTERS of the disclosed BIST for all circuits. See
N. A. Touba and E. J. McCluskey, Altering a Pseudo-Random Bit Sequence
for Scan-Based BIST, in Proceedings IEEE International Test
Conference, pages 167-175, 1996; and M. Karkala, N. A. Touba, and
H.-J. Wunderlich, Special ATPG to Correlate Test Patterns for
Low-Overhead Mixed-Mode BIST, in Proceedings 7th Asian Test
Symposium, 1998. In particular, for large circuits that have many hard
faults, such as c7552 and c2670, the hardware overhead required to
implement the disclosed BISTs is significantly lower than that required
to implement other BISTs.

IV.F. Conclusions 

Techniques to reduce test sequence length and hardware
overhead in 3-weight WRPT BIST (test-per-clock as well as test-per-
scan) are taught. In the disclosed 3-weight WRPT BIST, inputs are
fixed to required values by overriding random pattern signals when
the corresponding overriding signals are activated. The hardware
overhead due to implementing 3-weight WRPT BIST is typically
determined by the number of generators and overriding signals. In
order to reduce the number of generators and overriding signals, an
improved ATPG, which is an improvement over PODEM, is disclosed that
generates special testcubes for hard-to-detect faults. The selection in
the disclosed ATPG is guided by the cost functions that estimate the
number of generators and overriding signals required to achieve each
objective. Since the disclosed BIST has a very simple architecture, the
circuits to control the BIST can be implemented at low cost.
Furthermore, the disclosed improved ATPG also minimizes the number
of overriding signals as well as the number of generators. Finally, the
overriding signals generated by the disclosed improved ATPG are
further minimized by compatibility analysis and reordering scan chains.
Experimental results for synthesis benchmark circuits demonstrate
that hardware overhead required by the disclosed method is very low.
Since the disclosed method generates only suitable testcubes
that have the least number of conflicting inputs with testcubes in the
current testcube set, the procedure to select suitable testcubes from
a large pool of testcubes, which typically has high time complexity, is
not required. Furthermore, since the complexity of computing cost
functions is linear in the number of circuit lines, the time complexity of
the disclosed improved ATPG is almost the same as that of simple
combinational ATPGs. Hence, the disclosed BIST design methodology
is applicable to large circuits.
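
To make the notion of conflicting inputs concrete, the sketch below counts them for a candidate testcube against a current testcube set. It assumes testcubes are strings over '0', '1', and 'X' (don't care), and that an input conflicts when the candidate specifies the value opposite to one already specified at that input by some testcube in the set. This definition and the function name are assumptions for illustration; the disclosed ATPG is described as steering generation with cost functions rather than counting conflicts after the fact.

```python
def count_conflicting_inputs(candidate, testcube_set):
    """Number of inputs where candidate opposes a value specified in the set."""
    conflicts = 0
    for k, value in enumerate(candidate):
        if value == "X":
            continue                      # unspecified inputs never conflict
        opposite = "1" if value == "0" else "0"
        if any(cube[k] == opposite for cube in testcube_set):
            conflicts += 1
    return conflicts


# Hypothetical current testcube set over four inputs.
current_set = ["1X0X", "1X00"]
```

With this set, a candidate such as "0X01" conflicts at two inputs while "1100" conflicts at none, so the latter could share a generator with the existing testcubes without adding unfixed inputs.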

Unlike some conventional techniques that require specially
designed flip-flops, the disclosed technique does not require any
special library modules. Hence, the disclosed technique can be
implemented under any ASIC environment without designing special
library modules to support the disclosed technique. Even though only
LFSRs are used to generate pure random pattern sequences in the
experiments performed for this paper, any random pattern generator,
including even traditional weighted random pattern generators
such as M. Bershteyn, can be used to generate random pattern
sequences for the disclosed BIST. See M. Bershteyn, Calculation of
Multiple Sets of Weights for Weighted Random Testing, in Proceedings
IEEE International Test Conference, pages 1031-1040, 1993. In
addition, the disclosed BIST can be combined with other design-
for-test techniques such as test point insertion.

In some circuits, many inputs need to be fixed to set a single
internal line to a desired value. If such lines are directly set to desired
values by inserting test points, the number of overriding signals can
also be reduced.

Other modifications and variations to the invention will be
apparent to those skilled in the art from the foregoing disclosure and
teachings. Thus, while only certain embodiments of the invention have
been specifically described herein, it will be apparent that numerous
modifications may be made thereto without departing from the spirit and
scope of the invention.